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Abstract 


The  scope  of  this  project  addresses  the  problem  of  tracking  a  dim  moving  point  target  from  a  sequence 
of  hyperspectral  cubes.  The  resulting  tracking  algorithm  is  useful  for  many  staring  technologies  such 
as  the  ones  used  in  space  surveillance  and  missile  tracking  applications.  In  these  applications,  the 
images  consist  of  targets  moving  at  sub-pixel  velocity  and  noisy  background  consisting  of  evolving 
clutter  and  noise.  The  demand  for  a  low  false  alarm  rate  {FAR)  on  one  hand  and  a  high  probability  of 
detection  {Pd)  on  the  other  makes  the  tracking  a  challenging  task.  The  use  of  hyperspectral  images 
should  be  superior  to  current  technologies  using  broadband  IR  images  due  to  the  ability  of  exploiting 
simultaneously  two  target  specific  properties:  the  spectral  target  characteristics  and  the  time  dependent 
target  behavior. 

The  proposed  solution  consists  of  three  stages:  the  first  stage  transforms  the  hyperspectral  cubes  into  a 
two  dimensional  sequence,  using  known  point  target  detection  acquisition  methods;  the  second  stage 
involves  a  temporal  separation  of  the  2D  sequence  into  sub-sequences  and  the  usage  of  a  variance 
filter  {VF)  to  detect  the  presence  of  targets  from  the  temporal  profile  of  each  pixel  in  each  group,  while 
suppressing  clutter  specific  influences.  This  stage  creates  a  new  sequence  containing  a  target  with  a 
seemingly  faster  velocity;  the  third  stage  applies  the  Dynamic  Programming  Algorithm  {DPA)  that 
proves  to  be  a  very  effective  algorithm  for  the  tracking  of  moving  targets  with  low  SNR  at  around 
pixel  velocity. 

The  system  is  tested  on  both  synthetic  and  real  data. 
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1  Introduction 

The  goal  of  the  research  in  the  scope  of  the  project  is  to  propose  a  unique  system  for  the  tracking  of 
dim  point  targets  moving  at  sub-pixel  velocities  in  a  sequence  of  hyperspectral  cubes  or,  simply  put,  a 
hyperspectral  movie.  Our  research  incorporates  algorithms  from  two  different  areas  -  target  detection 
in  hyperspectral  imagery  and  target  tracking  in  IR  sequences.  Numerous  works  have  addressed  each  of 
these  problems  separately,  but  to  the  best  of  our  knowledge  no  attempts  were  made,  so  far,  to  combine 
the  two  fields. 

We  chose  the  most  intuitive  approach  to  tackle  the  problem,  namely  divide  and  conquer  -  separation 
of  the  problem  into  three  sub-problems  and  solving  each  one  separately  using  a  combination  of 
approaches.  Thus,  we  first  transform  each  hyperspectral  cube  into  a  two  dimensional  image  using  a 
hyperspectral  target  detection  method.  The  next  step  involves  a  temporal  separation  of  the  movie 
(images  sequence)  into  sub  movies  and  the  usage  of  a  variance  filter  ( VF ).  The  filter  detects  the 
presence  of  targets  from  the  temporal  profile  of  each  pixel  in  each  group  while  suppressing  clutter 
specific  influences.  Afterwards,  an  accumulation  of  the  results  from  several  consecutive  images  is 
done  using  a  Track  Before  Detect  ( TBD )  approach,  implemented  by  the  Dynamic  Programming 
Algorithm  ( DPA ),  to  perform  detection  in  the  time  domain.  Performance  metrics  are  defined  for  each 
step,  and  are  used  in  the  analysis  and  optimization  of  each  step. 

In  order  to  evaluate  the  complete  system,  we  need  to  obtain  a  hyperspectral  movie.  Since  this  kind  of 
data  is  not  available  to  us  yet,  an  algorithm  is  developed  for  the  creation  of  a  hyperspectral  movie, 
based  on  a  real  world  IR  sequence  and  real-world  signatures,  including  an  implanted  synthetic  moving 
target. 

This  paper  is  organized  as  follows:  Sections  2  and  3  provide  a  short  overview  of  the  basic  methods  for 
target  detection  in  hyperspectral  imagery  and  the  time  and  spatial  processing  methods  for  target 
tracking  in  IR  imagery  on  which  we  base  our  algorithm.  Section  4  presents  the  suggested  system. 
Section  5  presents  the  evaluation  of  the  full  system  on  real  and  synthetic  data.  Section  6  ends  the 
report  describing  conclusions  and  ideas  for  future  research. 
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2  Point  target  detection  in  hyperspectral  imagery 

2. 1  Hyperspectral  imagery  overview 

2.1.1  Hyperspectral  imagery  basics 

Hyperspectral  imagery  (HSI)  exploits  the  fact  that  different  materials  reflect,  absorb  and  emit 
electromagnetic  radiation  in  ways  characteristic  of  their  molecular  composition  and  structure  [1]  .  The 
radiation  arriving  at  the  hyperspectral  sensor  is  measured  at  each  wavelength  over  a  sufficiently  broad 
spectral  band,  and  the  resulting  spectral  signature,  or  simply  spectrum,  can  be  used  to  uniquely 
characterize  and  identify  any  given  material. 

Hyperspectral  sensors  represent  an  advance  on  technology  from  the  earlier  multispectral  sensors, 
which  were  able  to  collect  information  in  only  a  few  discrete  and  noncontiguous  bands.  Current 
hyperspectral  sensors  sample  the  reflective  portion  of  the  electromagnetic  spectrum  that  extends  from 
the  visible  region  (0.4  -  0.7  /urn)  through  the  near  infrared  (about  2.4  pm)  in  hundreds  of  narrow 
contiguous  bands  about  10  nm  wide.  Other  types  of  hyperspectral  sensors  exploit  the  emissive 
properties  of  materials  by  collecting  data  in  the  mid-wave  and  long-wave  infrared  regions  of  the 
spectrum.  From  a  spatial  point  of  view,  current  sensors  are  capable  of  spatial  sampling  (or  ground 
pixel  size)  in  the  resolution  of  several  meters  to  tens  of  meters,  depending  on  the  sensor  aperture  and 
platform  altitude  (mainly  space  borne  vs.  airborne  sensor).  For  example,  the  sensor  system  called 
Hyperion,  carried  on  the  experimental  NASA  EO-1  spacecraft  launched  in  November  2000,  has  220 
spectral  bands,  30  m  pixels  and  10  bit  data  system  [2] . 


The  spatially  and  spectrally  sampled  information  can  be  described  as  a  data  cube,  whose  face  is  a 
function  of  the  spatial  coordinates  and  depth  is  a  function  of  spectral  band,  as  shown  in  Figure  1. 
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There  are  many  effects  which  influence  the  data  collected  at  the  sensor,  besides  of  course  the  materials 
themselves.  Some  of  these  are: 

■  The  angle  of  the  sun. 

■  The  viewing  angle  of  the  sensor. 

■  The  upwelling  solar  radiance  from  atmospheric  scattering. 

■  The  secondary  illumination  of  the  material  by  light  reflected  from  adjacent  objects  in  the  scene. 

■  Shadowing. 

■  The  scattering  and  absorption  of  the  reflected  radiance  by  the  atmosphere. 

■  Spatial  and  spectral  aberrations  in  the  sensor. 

All  these  make  the  processing  of  the  hyperspectral  data  a  challenging  task  requiring  sophisticated 
signal  processing  algorithms  and  models. 


2.1.2  The  basic  steps  of  hyperspectral  signal  processing 


The  basic  hyperspectral  processing  chain  is  given  in  Figure  2.  There  are  two  important  preprocessing 
steps  which  should  be  performed  before  the  actual  processing  of  the  data  begins.  These  are  shortly 
described  in  the  following  subsections. 
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Figure  2:  Illustration  of  hyperspectral  imagery  processing  chain. 
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2.1.2. 1  Atmospheric  compensation 

The  first  preprocessing  step  is  atmospheric  compensation  or  atmospheric  calibration.  This  step  is 
needed  in  order  to  correct  the  data  for  the  effects  of  atmospheric  absorption  and  scattering. 

The  atmosphere  absorbs  and  scatters  light  in  a  wave-length-dependent  fashion  [3]  .  Thus,  the  “raw” 
data  from  the  sensor  cannot  be  directly  compared  to  either  laboratory  spectra  or  “raw”  spectra 
collected  at  other  times  or  places.  In  order  to  apply  signal  detection  algorithm,  the  reflectance  or 
emissive  spectra  of  the  objects  of  interest  must  be  converted  into  radiance  at  the  sensor,  or  the  radiance 
data  collected  by  the  sensor  must  be  converted  into  reflectance  or  emission.  This  process  is  known  as 
atmospheric  compensation. 

The  simplest  method  to  compensate  for  the  environmental  and  atmospheric  effects  is  to  place  a 
calibration  panel,  with  known  reflectance  in  the  scene,  in  an  open  area,  and  use  the  observed  radiance 
spectrum  from  the  panel  to  develop  gain  and  offset  corrections  for  each  waveband  of  interest.  Other 
more  sophisticated  methods  are  the  physics-based  modeling  methods.  These  include  processes  to 
quantify  the  effect  of  water  vapor  on  the  measurements,  estimates  of  terrain  height  and  aerosol  optical 
depth  (visibility). 
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2.1.2.2  Dimensionality  reduction 

The  next  important  step  is  dimensionality  reduction.  In  most  cases,  hyperspectral  sensors  over  sample 
the  spectral  signal  to  ensure  that  any  narrow  features  are  adequately  represented.  Dimensional 
reduction  is  often  desirable  because  it  leads  to  significant  reductions  in  computational  complexity  and 
reduction  of  the  number  of  pixels  required  to  obtain  statistical  estimates  (the  number  of  pixels  required 
to  obtain  statistical  estimate  with  a  given  accuracy  increases  drastically  with  the  dimensionality  of  the 
data).  Since  the  dimensional  reduction  involves  data  loss,  it  is  important  to  make  sure  that  high-quality 
features  needed  for  detection,  discrimination  and  classification  of  the  data  are  preserved. 

The  most  common  algorithm  for  dimensionality  reduction  is  principal  component  analysis  ( PCA ) 
which  is  based  on  the  Karhunen  -  Loeve  linear  transformation  [4]  .  Further  development  of  that 
transformation  can  be  done  using  a  kernel,  a  method  known  as  K-PCA,  which  is  a  non-linear 
transformation. 

2.1.3  Applications 

The  number  and  variety  of  civilian  and  military  applications  for  hyperspectral  remote  sensing  is 
enormous.  The  majority  of  algorithms  used  in  these  applications  might  be  roughly  divided  into  four 
main  classes: 

1.  Target  detection  -  defined  as  searching  for  the  pixels  of  the  hyperspectral  data  cube  which 
exhibit  similar  spectral  properties  to  the  desired  spectral  signature  (in  case  the  desired  signature 
is  known)  or  pixels  which  differ  significantly  from  their  surrounding  (anomaly  detection)  [1]  , 
[5], 

2.  Change  detection  -  defined  as  finding  the  predefined  significant  changes  between  two 
hyperspectral  scenes  of  the  same  geographic  region  [1] . 

3.  Classification  -  assigning  a  label  (class)  to  each  pixel  of  the  hyperspectral  data  cube  [1]  . 

4.  Spectral  unmixing  -  estimating  the  fraction  of  each  material  in  a  given  pixel  [6]  . 

This  research  focuses  on  the  subject  of  target  detection  and  tracking,  which  will  be  further  detailed  in 
the  next  section. 
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2.2  Algorithms  for  target  detection  in  hyperspectral  imagery 

2.2.1  Problem  Statement 

In  target  detection  applications,  the  goal  is  to  search  the  pixels  of  a  hyperspectral  data  cube  for  the 
presence  of  a  specific  material  (target).  In  surveillance  applications,  the  size  of  the  objects  (targets)  we 
are  searching  for  constitutes  a  very  small  fraction  of  the  total  search  area,  the  targets  size  is  only 
several  pixels.  The  term  “background”  is  used  to  refer  to  all  non  target  pixels  of  a  scene.  Usually, 
targets  are  man-made  objects  with  spectra  that  differ  from  the  spectra  of  natural  background  pixels. 

2.2.2  General  framework  for  target  detection  algorithms 

Most  hyperspectral  data  processing  techniques  assume  that  each  observation  can  be  modeled  as 
random  vector  whose  dimension  is  the  number  of  spectral  bands  and  which  originates  from  specific 
probability  distribution,  p(x).  Given  an  observed  spectrum  x,  the  likelihood  ratio  (LR)  is  given  by  the 
ratio  of  the  conditional  probability  density  functions: 

(2.1)  p(x\siSnaI  present) 

pyx\signal  absent ) 

If  LR(x)  is  larger  than  the  threshold  //,  the  “signal  present”  hypothesis  is  accepted;  otherwise  the 
“signal  absent”  hypothesis  is  accepted: 


(2.2) 


H i 

Ho 


2.2.3  Matched  filter  algorithms 

The  expression  for  the  Matched  Filter  detector  is  derived  here  from  the  likelihood  ratio  presented  in 
section  2.2.2.  In  order  to  build  a  likelihood  ratio  detector  there  is  a  need  to  define  the  statistical  model 
of  the  data,  namely  to  define  the  probability  distribution  functions.  Normal  probability  based  models 
are  simple  and  often  lead  to  good  performance. 
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The  multivariate  normal  distribution  of  random  vector  x  is  denoted  by  x  ~  N  (p,<D);  the  mean  vector  p 
and  the  covariance  matrix  ®  are  given  in  the  following: 


(2.3) 


ju  =  E^x\ 


®  =  E 


(x-//)(x-//)7 


where  E  [•]  denotes  the 
The  probability  density 

(2.4) 


expectation  operator. 


function  ( PDF)  of  x  is  given  by  (normal  distribution  function): 


p(x)  = 


N/2  il/2 


/-  \i\/z  I  , 

\2n)  I® 


exp 


4(x-^®'(x^) 
V  ^ 


where  |®|  is  the  determinant  of  the  covariance  matrix  ®. 

For  target  and  background  spectra  modeled  as  multivariate  normal  vectors,  the  detection  problem  can 
be  specified  by  the  following  hypotheses: 


~ 

of  target,  H t  assumes  presence  of  target;  b  and  5 
denote  the  background  and  target  parameters,  respectively. 

The  natural  logarithm  of  the  LR  given  in  (2.1)  becomes: 

(2.6)  R(x)  =  ^(x-pb)T  ^(x-p^-^x-uY 

This  expression  in  fact  compares  the  Mahalanobis  distances  of  the  observed  spectrum  from  the  centers 
of  the  two  classes. 


(2.5) 


Ho  :  x 
Hx  :  x 


where  H0  is  the  null  hypothesis  assuming  absence 


Page  18  of  102 


Project  Report  -  "Spatial  and  temporal  point  tracking  in  real  hyper  spectral  images" 


Benjamin  Aminov,  Ofir  Nichtern,  Stanley  Rotman 

If  the  target  and  background  classes  have  the  same  covariance  matrix,  that  is,  ®t  =  ®b  =  ®,  the  LR 
detector  becomes: 


R(x)  =  (jut-JubY  ®~1x  =  cTx 


This  detector  is  known  as  the  Fisher’s  linear  discriminant  and  is  very  useful  in  pattern  recognition 
applications.  In  the  communications  and  signal  processing  areas,  as  well  as  hyperspectral  imagery 
processing,  the  term  Matched  Filter  ( MF)  is  used  as  reference  to  any  filter  of  the  form: 


(2.8) 

where  A:  is  a  normalization  constant. 


R(X)  CMFX 

CMF  =  (  Ar  —  Fb) 


In  practical  applications,  the  mean  vectors  and  the  covariance  matrix,  required  for  the  MF  calculation, 
are  unavailable  and  have  to  be  estimated  from  the  available  data.  Under  the  assumption  of  low 
probability  targets,  given  N  vectors  x(n),  n= 1,2,  . . .,  N,  the  maximum  likelihood  estimates  for  the  mean 
and  covariance  matrix  of  the  background  are  given  in  the  following: 

(2-9) 

6  =  — X[x(n)-/}][x(n)-/)]r*6, 

iV  n= 1 

As  for  the  target,  usually  there  is  not  enough  training  data  to  determine  its  mean  and  covariance.  Thus 
the  target  spectral  signature  5  is  often  taken  from  a  spectral  library,  or  it  is  chosen  to  be  the  mean  of  a 
small  number  of  known  target  pixels  observed  under  the  same  conditions. 

The  resulting  adaptive  matched  filter  ( AMF)  is  given  by: 

(2.10) 

where  the  data  cube  mean  is  normally  removed  from  the  target  and  test  pixel  spectra. 
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2.2.4  Anomaly  detection  algorithms 

Anomaly  detection  is  based  on  seeking  out  the  pixels  exhibiting  strong  spectral  differences  from  the 
surrounding  background  [7] . 

There  are  situations,  besides  the  trivial  case  of  unknown  target  signature,  in  which  the  anomaly 
detection  is  advantageous  to  matched  detection  methods.  Detection  algorithms  that  presume  a  target 
signature  are  subject  to  signal  mismatch  losses  because  of  the  complications  of  converting  sensor  data 
into  material  spectra  -  in  order  to  apply  the  known-signal  detection  algorithm,  the  reflectance  or 
emissive  spectra  of  the  objects  of  interest  must  be  converted  into  radiance  at  the  sensor,  or  the  radiance 
data  collected  by  the  sensor  must  be  converted  into  reflectance  or  emission,  this  process  is  known  as 
atmospheric  calibration  and  was  described  in  section  2. 1.2.1.  In  this  process,  errors  in  the  estimate  of 
reflective  spectra  may  occur,  which  will  lead  to  signal  processing  mismatch  losses.  Target  matching 
approaches  are  further  complicated  by  the  large  number  of  possible  objects  of  interest  and  the 
uncertainty  as  to  the  reflectance  or  emission  spectra  of  these  objects.  For  example,  the  surface  of  the 
object  in  interest  may  consist  of  several  materials,  and  the  spectra  may  be  effected  by  weathering  or 
other  unknown  factors. 

Thus,  the  multiplicity  of  possible  spectra  associated  with  the  objects  of  interest  and  complications  of 
atmospheric  compensation  have  lead  to  the  development  of  anomaly  detectors  that  seek  to  distinguish 
observations  of  unusual  materials  from  typical  background  materials  without  reference  to  target 
signatures  or  target  subspaces. 

Anomalies  are  defined  with  reference  to  a  model  of  the  background.  Background  models  are 
developed  adaptively  using  reference  data  from  either  a  local  neighborhood  of  the  examined  pixel  or  a 
larger  section  of  the  image.  Local  anomalies  are  defined  as  observations  that  deviate  in  some  way  from 
the  estimated  background. 

2.2.4.1  The  RX  Algorithm 

The  RX  algorithm  is  the  most  common  representative  of  anomaly  detectors.  It  is  based  on  the 
Generalized  Likelihood  Ratio  Test  ( GLRT)  presented  in  section  2.2.2,  assuming  normal  probability 
distribution  of  the  data. 
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In  order  to  calculate  the  GLRT,  the  PDF  of  the  data  needs  to  be  estimated.  Here  the  PDF  is  assumed  to 
have  a  parametric  form,  thus  the  PDF's  parameter  9  can  be  estimated  from  reference  data.  The 
assumption  is  that  the  reference  data  set  consists  of  N  independent  identically  distributed  (HD) 

samples  of  dimension  K  (the  model  allows  N=0)  denoted  by  jy  e  CK  |l  <  j  <  jV}  ,  the  PDF  of  the 
reference  data  set  is  denoted  by  p0  ( y,  0O ) .  The  test  data  set  is  denoted  byjxj  eCK  |l  <  j  <  M j .  The 

data  set  is  to  be  classified  as  arising  from  either  PDF  a(t>^i)(-^i)  or  p0  (y,#0)(//0)  .  The  GLRT  is 
given  by: 


(2.11) 


G(x) 


max  (a  ({x;},6>)  p0  ({v,  |l  <  j  <  iv} ,  6> )) 

max  (a  ({*,},  )Po  ({v,  M  J  ^  M  ’  0o )) 


Hi 

> 


< 

Ho 


The  data  is  modeled  as  following: 


(2.12) 


H0:x~  N(p,Q>) 
H, :  x  ~  N(s,0) 


where  N  (p, O)  denotes  normal  distribution  with  mean  p  and  covariance  <h.  s  and  are  unknown 
parameters. 


The  GLRT  for  this  model  is  given  by: 

RX  (x]  =  (x-  u\  — — — Oh - - — (x-  u)(x-  uY  ( x-u )  ^  rt 

W  V  ’  \N  +  l  N  +  V  A  ’  J  V  ’  < 


^  i  1 


V1 


Hi 

> 


(2.13) 


Hr\ 


where  O  is  the  covariance  matrix  estimated  from  the  reference  data. 


As  iV— ►  oo,  RX  converges  to: 


(2.14)  RX(x)  =  (x-p)r  <b~l(x-p)  >  ti 

H0 

RX  is  a  single  pixel  form  of  the  ReedXiaoli  algorithm  that  is  often  approximated  by  equation  (2.14). 
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This  expression  for  the  RX  algorithm  is  closely  related  to  the  expression  given  in  equation  (2.10)  for 
the  Matched  Filter  detector.  The  denominator  in  (2.10)  can  be  ignored,  since  it  is  a  normalizing 
constant  term.  The  RX  now  resembles  the  Matched  Filter,  but  instead  of  matching  to  the  target 
signature  (which  is  unknown  for  the  RX),  the  match  is  performed  on  the  (x  -  p)  term.  The  intuitive 
explanation  for  this  is  that  the  best  estimation  of  the  actual  pixel’s  content  is  its  value  subtracted  by  the 
background.  If  a  target  is  present  then  the  residue  will  be  stronger  and  the  overall  response  of  the  filter 
will  be  higher.  If  the  target  is  absent,  the  residue  after  the  background  subtraction  should  be  close  to 
zero  (depending  on  the  noise  level)  and  the  filter  output  will  be  low  indicating  less  likeliness  of  target 
presence. 

It  is  important  to  note,  that  the  distribution  of  the  test  statistic  for  Ho  is  independent  of  the  unknown 
parameters,  and  thus  the  test  statistic  has  the  constant  false  alarm  rate  ( CFAR )  property. 

2.2.5  Evaluation  of  detection  algorithms 

As  stated  in  the  previous  subsection  the  detection  is  based  on  determining  appropriate  threshold 
against  which  the  LR  is  compared  [5]  .  A  practical  question  of  great  importance  to  a  detection 
algorithm  user  is  where  to  set  the  threshold  to  keep  the  number  of  detection  errors  (target  misses  and 
false  alarms)  small.  There  is  always  a  compromise  between  choosing  a  low  threshold  to  increase  the 
probability  of  detection  Pd  and  a  high  threshold  to  keep  the  probability  of  false  alarm  PFA  low.  For  any 
given  detector,  the  trade-off  between  Pd  and  Pfa  is  described  by  the  receiver  operating  characteristic 
( ROC)  curves,  which  plot  Pd(v)  versus  Pfa(v) as  a  function  of  threshold  -co<//<co.  The  calculation  of 
the  ROC  curves  or  the  threshold  requires  specifying  the  distribution  of  the  observed  spectra  x  under 
each  of  the  two  hypotheses  specified  for  the  LR.  In  most  practical  situations,  these  conditional 
probability  densities  depend  on  some  unknown  target  and  background  parameters  (composite 
hypotheses).  Therefore,  the  ROC  curves  of  any  detector  depend  on  the  unknown  parameters.  In  this 
case,  it  is  almost  impossible  to  find  a  detector  whose  ROC  curves  remain  an  upper  bound  for  the  whole 
range  of  the  unknown  parameters  (uniformly  most  powerful  ( UMP )  detector). 

Practical  target  detection  systems  should  function  automatically,  that  is,  without  operator  intervention. 
This  requires  an  automatic  strategy  to  set  a  “proper”  detection  threshold.  In  certain  applications,  for 
example  military  target  detection,  a  high  false  alarm  rate  is  very  undesirable.  Therefore  it  is  critical  to 
keep  the  false  alarm  rate  constant  at  the  maximal  affordable  level  by  using  a  constant  false  alarm  rate 
{CFAR)  processor.  The  task  of  a  CFAR  algorithm  is  to  provide  detection  thresholds  that  are  relatively 
immune  to  noise  and  background  variation  and  allow  target  detection  with  a  constant  false  alarm  rate. 
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In  addition,  the  performance  evaluation  of  detection  algorithms  in  practice  is  challenging  due  to  the 
limitations  imposed  by  the  limited  amount  of  target  data.  As  a  result  the  establishment  of  accurate 
ROC  curves  is  quite  difficult.  A  practical  mean  for  comparing  various  detection  algorithms  is  their 
ability  to  operate  in  CFAR  mode  and  the  enhancement  of  the  separation  between  targets  and 
background  they  provide.  The  CFAR  property  depends  on  the  capability  to  accurately  model  the 
detection  statistics  of  the  background  pixels  for  a  given  algorithm. 
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3  Point  target  detection  and  tracking  in  consecutive  frame  in  IR 
imagery 

3. 1  Overview 

3.1.1  Problem  statement 

In  order  to  develop  advanced  staring  IR  technologies  in  surveillance  applications,  there  is  a  need  for 
algorithms  to  detect  weak  point  target  moving  at  pixel/sub-pixel  per  frame  velocities.  These 
algorithms  must  handle  scenes  containing  both  clutter  and  noise  dominated  backgrounds.  Clutter 
dominated  background  is  a  background  in  which  the  largest  non-target  detections  (false  alarms)  are 
from  the  structured  background.  In  the  context  of  the  temporal  processing,  evolving  clouds  are  the 
most  severe  cause  of  such  leakage.  In  noise  dominated  background  random  noise  provides  the  largest 
non-target  detections.  Scenes  with  only  blue  sky,  night  scenes,  or  scenes  with  temporally  stationary 
clutter  are  well  modeled  by  the  white  Gaussian  noise  in  the  temporal  dimension  and  hence  are  noise- 
dominated  backgrounds  in  the  context  of  temporal  processing. 

An  efficient  algorithm  must  thus  address  these  dual  and  often  conflicting  aspects  of  the  problem:  the 
need  for  effective  clutter  suppression  for  minimizing  false  alarm  rates  in  structured  backgrounds,  and 
the  need  for  good  signal-to-noise  sensitivity  for  maximum  range  of  weak  target  detection.  A  possible 
solution  to  the  problem  is  to  use  a  dynamic  programming  algorithm  ( DPA )  which  gives  scores  to 
pixels  according  to  their  likelihood  of  being  a  target  and  accumulates  scores  from  frame  to  frame  for 
all  the  pixels.  Using  that  algorithm  enables  the  tracking  of  low  SNR  targets.  It  is  also  advantageous  to 
use  a  preprocessing  stage,  which  whitens  the  background,  lowers  the  clutter  and  emphasized  the  target 
to  achieve  a  lower  false  alarm  rate. 

3.1.2  Application 

Point  target  detection  is  particularly  useful  for  staring  IR  technologies,  e.g.,  IR  search  and  track 
( IRST ),  such  as  the  ones  used  in  surveillance  application  or  missile  detection  and  tracking.  These  are 
applications  in  which  the  sensor,  in  this  case  the  camera,  is  fixed  on  the  ground  and  pointed  towards 
the  sky.  The  camera  is  not  moving;  the  temporal  variation  of  the  signal  originates  from  clutter  (clouds) 
and  target  movement  and,  of  course,  noise. 
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3.2  Algorithms  for  moving  point  target  detection  in  IR  imagery 
3.2.1  The  track  before  detect  approach 

Track  before  detect  ( TBD )  is  a  technique  for  target  detection  in  which  the  detection  is  not  declared  at 
each  frame;  instead  a  number  of  frames  of  data  are  processed  after  which  the  estimated  track  is 
returned  when  the  detection  is  declared.  Since  target  trajectories  are  not  known  prior  to  detection,  TBD 
schemes  typically  require  that  multiple  frame  (scan-to-scan)  processing  be  performed  against  multiple, 
simultaneous  target  trajectory  hypotheses.  Detections  are  declared  for  those  hypotheses  for  which  the 
integrated  energy  exceeds  appropriate  threshold  levels. 


3.2.2  Dynamic  Programming  Algorithm  (DPA) 

The  DPA  is  an  implementation  of  the  TBD  approach  that  aids  in  the  tracking  of  a  dim  point  target 
maneuvering  in  a  noisy  background,  [17]  ,[18]  ,  [19]  .  It  is  based  on  the  assumption  that  a  real  target 
moves  in  a  trajectory  while  noise  appears  and  disappears  in  a  random  fashion.  In  every  frame,  each 
pixel  is  considered  as  a  potential  target  and  is  checked  for  its  origin  in  the  previous  frame.  By  doing 
so,  all  possible  trajectories  with  high  probability  are  built  up,  and  it  is  assumed  that  with  the 
progression  in  the  image  sequence  and  the  addition  of  data,  the  probability  of  a  single  trajectory  will 
grow  above  the  others.  This  trajectory  will  be  the  target's  path. 

The  algorithm  uses  the  following  principles  as  guidelines: 

1.  Overcoming  low  SNR  using  prior  knowledge  of  the  way  the  target  moves;  an  apparent 
unlikely  change  of  target  speed  and  location  will  lower  the  probability  of  the  target  to  be 
part  of  the  trajectory  we  would  like  to  find. 

2.  Processing  is  done  not  on  a  single  frame  but  rather  on  a  given  sequence  so  that  the  nth 
frame  will  contain  accumulated  information  from  the  previous  frames. 

3.  Priority  is  given  to  the  detection  of  the  target's  trajectory. 

The  DPA  algorithm  uses  a  Markov  Process  Model  in  order  to  give  scores  to  the  image  sequence; 
when  the  current  frame  is  processed,  all  the  vital  information  from  the  preceding  frames  lies  in  the 
accumulated  score  matrix  ( ASM)  of  the  previous  frame  (state).  At  the  end  of  the  DPA  stage,  the  target 
is  acquired  from  the  last  frame,  and  its  trajectory  is  found. 
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3.2.3  Temporal  processing 

Temporal  processing  exploits  the  change  in  a  pixel's  intensity  as  moving  point  target  traverses  it.  The 
temporal  profile  of  target  affected  pixel  raises  and  falls  as  the  target  enters  and  exits  the  pixel,  while 
clutter  affected  temporal  profiles  show  more  monotonic  intensity  changes  over  an  equivalent  time 
period,  (Figure  3). 
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Figure  3:  Example  of  clutter  and  target  temporal  profiles. 

References  [8]  and  [9]  introduce  a  zero-mean  damped  sinusoid  filter.  The  Triple  Temporal  Filter 
( TTF )  has  been  designed  to  have  a  strong  response  to  relatively  narrow  peaks  and  a  weak  response  to 
monotonic  changes.  The  filter  is  based  on  applying  three  sinusoid  filters  serially  -  a  sequence  of  two 
zero-mean  damped  sinusoids  followed  by  an  exponential  averaging  filter  along  with  an  edge 
suppression  factor.  The  implementation  of  the  TTF  is  recursive;  the  output  result  is  updated  with  each 
new  frame.  Figure  4  shows  the  impulse  response  of  a  zero-mean  damped  sinusoid  filter. 


Frame  delay 

Figure  4:  Impulse  response  of  zero-mean  damped  sinusoid  temporal  filter. 
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Each  filter  is  recursively  implemented  using  the  following  expression: 

(3.1)  Z*.,  = -Z*  •«•**+ <4.,  •e* 

where  a  is  the  damping  factor,  6  and  (j>  are  the  angle  and  initial  phase  of  the  sinusoid  respectively  and 
dk+i  is  the  current  sample  examined. 

The  TTF  is  specified  by  six  parameters:  the  periods  and  the  damping  constants  of  the  first  two 
sinusoids,  the  damping  constant  of  the  averaging  filter  and  an  edge  parameter  used  in  the  edge 
suppression  spatial  adjunct.  The  parameters  can  be  derived  by  applying  the  simplex  algorithm  to  sets 
of  real  data,  reference  [10]  describes  this  procedure. 

The  filter  parameters  define  its  clutter  suppression  ability  on  one  hand,  and  SNR  sensitivity  on  the 
other.  Thus,  a  certain  set  of  parameters  results  in  a  clutter  suppressive  filter  (Cloud  Filter  -  CF),  while 
a  different  set  of  parameters  results  in  a  filter  which  performs  better  in  noise-dominated  scenes  (Noise 
Filter  -  NF). 

Reference  [9]  presents  a  fusion  filter  (FF)  which  incorporates  the  output  of  two  double  temporal 
filters,  one  designed  for  clutter  suppression  and  the  other  for  low  SNR  operation,  thus  providing  a 
filter  capable  of  dealing  with  both  clutter  and  noise  dominated  scenes. 

The  FF  extends  the  range  of  velocities  over  which  weak  targets  can  be  extracted  from  temporal  noise 
as  well  as  above  the  clutter  leakage.  The  form  of  the  FF  is  two  double  temporal  filters  running  in 
parallel.  One  channel  of  the  FF  is  derived  from  the  CF  (unmodified)  and  the  second  channel  is  a 
modified  version  of  the  NF.  The  modified  NF  enables  the  detection  of  slow  weak  targets  in  clear  sky 
and  still  control  clutter  leakage.  The  final  FF  algorithm  result  is  taken  as  a  maximum  of  the  two 
channels  for  each  pixel  followed  by  an  eight  frame  average.  The  FF  is  illustrated  schematically  below. 
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Figure  5:  Fusion  Filter  configuration. 
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3.2.3. 1  Variance  filter 

Another  type  of  temporal  processing  filter  is  the  variance  filter  (VF)  presented  in  [1 1]  .  The  noise  filter 
based  on  damped  sinusoids  can  detect  targets  roughly  as  weak  as  S/N  =  3,  but  only  over  a  narrow 


range  of  focal  plane  velocities:  0.05  to  0.1  [ppf\.  Sensitivity  falls  off  rapidly  for  velocities  slower  or 


faster  than  this.  The  variance  filter  ,VF,  can  detect  targets  this  weak,  or  weaker  in  some  cases,  over  the 
extended  range:  0.02  to  0.25  [ppf).  The  algorithm  involves  simple  recursive  steps  easily  implemented 
in  hardware  which  provides  a  recursively  updated  estimate  of  the  following  quantity  for  each  pixel: 
(temporal  variance)  -  (long  time  baseline  variance).  The  quantity  is  zeroed  unless  positive.  The 
variance  filter  estimates  the  change  in  temporal  variance  by  calculating  the  short-term  variance  and 
subtracting  the  estimated  long-term  variance. 

The  baseline  algorithm  consists  of  four  steps  per  pixel  done  in  sequence  and  recursively  updated  with 
each  new  temporal  frame  of  data.  The  four  steps  correspond  to  the  four  equations  below;  the  steps 
involve  exponentially  weighted  averages  of  16  or  32  frames  (nominally). 

■  Step  1  updates  the  32  frame  average  M32  of  the  input  data  for  each  pixel. 

■  Step  2  updates  the  16  frame  average  estimate  of  the  variance  defined  as  (input-M32)  . 

■  Step  3  generates  a  change  in  variance  (zeroed  unless  positive)  by  subtracting  from  Step  2,  a 
baseline  estimate  of  the  pixel  variance  given  by  the  32  frame  average  of  Step  2. 

■  Step  4  updates  the  16  frame  average  of  Step  3  providing  the  displayed  value  of  the  variance 
filter  ( VF). 

■  R1-R4  in  the  next  equations  represents  the  recursive  output  of  these  steps  for  each  pixel  with 
each  new  input  frame.  The  brackets  with  subscripts  refer  to  the  recursive  frame  average 
computed  as  explained  subsequently. 


1.  R{  =  (input)  n 


(3.2) 


3.  R3  =  R2 -(R2}22  if>0,  else  0 

4- 
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The  exponentially  weighted  average  for  arbitrary  vector  of  samples  x  is  given  in  the  following: 


(3.3) 


isf— 1 

1  j=o\  l  J 


x[k-j] 


where  k  is  the  kth  input  sample  (which  is  the  most  recent  sample),  i  is  the  number  of  previous  samples 
participating  in  the  average  calculation  (including  the  most  recent  sample). 


The  VF  responds  to  moving  targets  of  positive  or  negative  contrast.  Typical  targets  of  interest  are 
positive  contrast.  For  such  cases,  a  preprocessing  technique  based  on  morphological  image  processing 
but  implemented  by  median  filters  is  applied  and  further  improves  the  SNR  performance  of  the 
algorithm. 


3.2.3.2  Performance  metrics 


There  is  a  need  for  defining  proprietary  performance  metrics  for  point  target  detection  in  consecutive 
frame  IR  imagery  based  on  temporal  processing.  This  task  becomes  more  challenging  when  working 
with  real  world  data.  Standard  metrics  are  based  on  some  variations  of  the  following  equation  for  gains 


in  dB  [12]  : 
(3.4) 


Gain  =  20  log 


(5/cL 

(s/cl 


where  S  and  C  are  target  and  clutter  measures  respectively,  in  and  out  denote  the  data  at  the  input  and 
at  the  output  of  the  relevant  algorithm  respectively.  Usually  the  values  of  S  are  predetermined  since  the 
target  is  often  embedded  into  the  sequence.  The  values  of  C  can  be  taken  either  as  the  standard 
deviations  or  the  variances  of  a  single  frame. 

When  working  with  real  world  data  several  issues  arise:  the  signal  strength  for  targets  with  no 
auxiliary  ground  truth  is  not  known  and  must  be  extracted  from  the  data;  the  clutter  probability  density 
function  is  rarely  Gaussian,  because  of  that  the  clutter  is  not  accurately  characterized  by  standard 
statistical  measures;  the  measure  of  the  initial  clutter  severity  should  relate  to  its  temporal  non- 
stationarity  since  stationary  clutter  can  be  removed  by  well-designed  temporal  processing. 
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Reference  [12]  suggests  the  following  desired  features  for  performance  metric  of  temporal  algorithm: 

1 .  Have  some  link  to  previous  practice  (for  example  still  employ  equation  (3.4)). 

2.  Be  relevant  to  false  alarm  threats. 

3.  Capable  of  operating  on  real  targets  without  auxiliary  ground  truth. 

4.  Be  adequate  to  temporal  processing. 

5.  Not  be  unduly  influenced  by  simple  processing  steps  like  mean  or  local  mean  removal. 

6.  Be  symmetric  in  its  treatment  of  the  input  and  fdtered  output. 

The  reference  suggests  two  metrics  based  on  equation  (3.4).  The  first  metric  is  called  the  Variance 
Metric  ( VM ).  The  ratio  ( S/C) out  is  taken  as  the  ratio  between  the  largest  target  related  pixel  output  to 
the  largest  non-target  output  from  the  filtered  output  frame/s.  The  symmetric  use  of  the  corresponding 
largest  target  and  non-target  input  values  from  the  unprocessed  frames  in  the  (S/C)in  calculation  would 
be  inconsistent  with  the  features  4  and  5  defined  above  and  does  not  address  the  target  in  terms  of  its 
surroundings.  The  Cin  is  taken  as  the  maximal  value  of  the  temporal  standard  deviation  calculated  for 
each  of  the  non-target  pixels.  The  approach  for  determining  Sin  is  not  as  straightforward  since  it  is 
difficult  to  determine  the  intensity  of  a  real  target  when  it  moves  slowly  through  sampled  imagery. 
Thus  the  input  target  strength  is  characterized  as  the  average  of  the  maximal  values  of  the  target  pixels 
after  subtracting  the  local  background  mean  estimated  from  non-target  pixels. 

The  second  metric  presented  in  source  [12]  is  the  Anti-median  Metric  {AM).  An  anti-median  filter  is 
applied  on  the  input  and  output  frame/frames.  The  filter  takes  an  absolute  difference  between  the 
center  pixel  and  the  median  of  5x5  block  about  and  including  this  pixel.  The  ratios  (S/C) out  and  (S/C)in 
are  taken  from  the  largest  target  value  and  the  largest  non-target  value  from  the  output  and  input  frame 
respectively.  If  there  are  more  than  one  frame,  averaging  over  the  number  of  frames  is  performed  for 
the  largest  values. 

None  of  the  two  metrics  presented  satisfies  the  desired  demands  listed  above.  The  complicated 
dynamics  of  moving  targets  and  evolving  clutter  recorded  on  a  staring  array  is  hardly  captured  by  a 
single  measure. 
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In  order  to  evaluate  the  algorithm  a  new  metric  has  been  defined.  Each  frame  in  the  sequence  is 
divided  into  blocks,  the  size  of  HxN  (30x30  were  used). The  algorithm  is  run  over  9  blocks  -  Target 
block  and  eight  adjacent  blocks.  The  SNR  of  the  target  block  (TB)  and  its  eight  adjacent  non-target 
blocks  ( NTB )  are  calculated.  Afterwards,  an  algorithm  score  is  calculated  based  on  the  resulting  SNR's. 


The  block  SNR  is  given  as: 
(3.5) 


Block  _SNR(i,j) 


where  v,t/-  is  the  set  of  pixels  belonging  to  the  (i,j)th  block,  M  is  a  set  containing  the  five  pixels  with  the 
highest  gray  level  in  that  block,  and  a  is  the  standard  deviation  of  the  block  pixels.  The  formula 
performs  a  subtraction  between  the  expectation  value  of  the  highest  pixels  (target)  and  the  expectation 
value  of  the  rest  of  the  pixels  (background),  divided  by  the  standard  deviation  of  block  pixels.  Since 
the  probability  matrices  introduce  the  influence  of  target  pixels  on  adjacent  pixels,  these  influenced 
pixels  might  accumulate  higher  values  than  uneffected  pixels  (background),  and  can  be  regarded  as 
target  pixels.  This  might  lower  the  expectation  value  of  the  target,  but  will  lower  the  standard 
deviation  of  the  background,  since  these  high  pixels  are  higher  than  the  statistics  of  the  background,  to 
a  more  accurate  one. 


The  algorithm  score  is  given  as: 

Block SNR(i,j\IJhn-E[{BIock SNR(i,  j)}( 


(3.6) 


Score  =  ■ 


7{Block_SNR(i,j)} 


(i.j)=NTB 


The  score  is  calculated  by  subtracting  between  the  target  block  SNR  and  the  expectation  value  of  the 
SNR  of  the  non-target  blocks,  divided  by  the  standard  deviation  of  the  non-target  blocks  SNR.  The 
algorithm  always  identifies  a  target  in  each  given  block.  Since  that  is  the  case,  the  false  detection  of 
targets  should  achieve  less  SNR  in  these  NTB,  resulting  in  higher  score.  It  should  be  noted  that  the 
division  to  blocks  is  only  for  the  purpose  of  algorithm  evaluation  (score),  where  in  real  scenarios,  the 
frames  will  not  be  divided.  In  these  cases  the  algorithm  will  produce  several  high  pixels  -  true  and 
false  targets.  Using  the  NTB' s  SNR  in  the  algorithm  score  simulates  the  non-target  scenarios.  In  the 
tests,  we  will  vary  the  number  of  highest  pixels,  defined  as  Max_Numbers,  which  enter  the  target 
metric  in  this  way.  We  will  check  the  influence  of  the  number  of  high  pixels  regarded  as  ‘target 
pixels’  on  the  Block_SNR. 
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4  A  system  for  point  target  tracking  in  consecutive  frame 
hyperspectral  imagery 

4. 1  General  Architecture 

The  system  performs  target  detection  and  tracking  in  three  steps.  The  first  step  is  to  transform  the  three 
dimensional  hypercube  into  a  two  dimensional  image  using  one  of  the  well  known  hyperspectral 
reduction  algorithms.  The  second  stage  involves  a  temporal  separation  of  the  images  sequence  into 
sub-  images  sequence  and  the  usage  of  a  variance  filter  (temporal  processing)  to  detect  the  presence  of 
targets  from  the  temporal  profile  of  each  pixel  groups  (implementation  of  a  variance  filter  on  the  sub¬ 
temporal  profiles  promise  a  pixel  velocity  at  new  sequences),  while  suppressing  clutter  specific 
influences.  The  third  step  is  target  detection  and  tracking  based  on  the  TBD  approach  and  implemented 
using  DPA  that  proved  to  be  an  effective  algorithm  for  the  tracking  of  moving  targets  with  low  SNR. 
The  general  system  architecture  is  given  in  Figure  6. 
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Figure  6:  General  system  architecture. 
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4.2  Hyperspectral  reduction  algorithm 

Three  different  reduction  tests  were  applied  on  each  temporal  hypercube  individually.  Each  of  these 
methods  is  characterized  by  a  mathematical  operator,  which  is  calculated  on  each  pixel.  In  every 
frame,  a  map  of  pixel  scores  is  received  (the  result  of  the  mathematical  operator)  and  used  to  create  the 
movie. 

4.2.1  Test  1  -  Spectral  Average 

Implementation  of  a  simple  spectral  average  of  each  pixel: 

(4-l)  = 

n 

where  x  denotes  a  pixel's  spectrum,  x„  the  spectral  value  of  the  nth  band,  and  N  is  the  number  of 
spectral  bands. 

4.2.2  Test  2  -  Scalar  Product 

Implies  a  simple  scalar  product  of  the  pixel's  spectrum  (after  mean  background  subtraction)  with  the 
known  target  spectral  signature: 

(4.2)  Scalar  Product  =  tT  -(x-m) 

4.2.3  Test  3  -  Matched  Filter 

The  model  assumed  is  the  linear  mixture  model  (LMM)  and  a  known  target  signature.  The  Matched 
Filter  detector  given  in  equation  (2.10)  is  rewritten  as  follows: 

(4.3)  MF  =  tT<&-\x-m) 

where  x  is  the  pixel  being  examined,  t  is  the  known  target  signature,  and  m  and  4)  are  the  background 
and  covariance  matrix  estimations,  respectively.  Compared  to  (2.10),  the  expression  given  in  (4.3) 
ignores  the  constant  denominator  and  explicitly  states  the  background  subtraction  procedure  prior  to 
applying  the  filter. 

The  background  estimation  is  performed  on  the  closest  neighbors,  which  definitely  do  not  contain 
target;  for  example,  if  the  target  is  known  to  be  at  most  two  pixels  wide,  the  background  is  estimated 
from  the  16  surrounding  neighbors  as  illustrated  in  Figure  7. 
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Figure  7:  Pixels  used  for  background  estimation  for  2x2  pixels  target. 


Each  of  the  tests  mentioned  above  were  run  with  a  different  target  factor  (intensity).  This  intensity  can 
be  controlled  manually  by  the  hyperspectral  data  creation  algorithm,  and  it  is  an  external  parameter  to 
those  test  run  mentioned  above.  The  higher  the  target  factor  value,  the  stronger  the  intensity  of  the 
implanted  target,  applying  less  difficulty  to  the  detection  and  tracking  algorithm.  The  implantation 
model  of  the  target  is  directly  proportional  to  the  target  factor  (intensity  of  implantation). 

4.3  The  temporal  processing  algorithm 

Overall,  the  input  to  the  first  stage  is  a  hyperspectral  cube;  the  output  of  the  first  stage  is  a  two- 
dimensional  image  obtained  from  the  hypercube.  A  buffering  of  several  such  images  is  needed  in 
order  to  obtain  a  sequence  long  enough  to  perform  temporal  processing.  The  input  for  the  temporal 
processing  algorithm  is  a  temporal  profile  of  a  pixel.  Figure  8  defines  our  terminology  at  this  stage. 


Figure  8:  Definitions  at  sequences  after  hyperspectral  reduction  algorithm. 

The  temporal  processing  algorithm  start  with  a  temporal  separation  of  each  temporal  profile  to  sub¬ 
temporal  profiles  to  achieve  target  moving  at  a  pixel  velocity  from  images  containing  targets  moving 
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at  a  sub-pixel  velocity.  For  example,  the  profile  at  Figure  9  (above)  is  an  input  to  the  temporal 
separation  as  shown  at  Figure  9  (below).  The  number  of  sub  profiles  is  defined  by: 


(4.4) 


,  =  N-G0 
J  g-g0 


where  N  is  the  number  of  frames  at  profile,  Go  is  the  overlap  between  each  of  sub-profiles  and  G  is  the 
sub-profile  length. 


Figure  9:  Profile  with  target  before  (above)  and  after  (below)  temporal  separation1. 


Afterwards,  the  temporal  processing  algorithm  is  applied.  The  temporal  processing  algorithm  is  based 
on  a  comparison  of  an  estimation  of  the  overall  DC  estimation  of  the  sub-profile  (DC  estimation 
calculated  for  the  overall  profile  in  order  to  achieve  best  or  more  accurate  DC  estimation,  the  DC 
estimation  of  the  sub-profile  chosen  by  the  relation  location  at  the  overall  profile)  and  the  single 
highest  fluctuation  which  occurs  within  the  sub-profile.  In  order  to  estimate  the  overall  temporal  DC 
estimation  (baseline  background  estimation)  was  performed  using  a  long  term  window  of  samples. 
The  background  estimation  is  performed  by  calculating  a  linear  fit  by  means  of  least  squares 


1  The  specific  profile  was  chosen  with  target  -  peak  at  frame  10;  the  profile  length  is  95  frames;  the  sub-profile  length  is  10 
frames  and  overlap  is  equal  to  5  frames. 


Page  35  of  102 


Project  Report  -  "Spatial  and  temporal  point  tracking  in  real  hyper  spectral  images" 


Benjamin  Aminov ,  Ofir  Nichtern,  Stanley  Rotman 


estimation  ( LSE)  [13]  .  The  fluctuation  or  short-term  variance  estimation  is  performed  on  a  shorter 
term  window  of  samples,  after  removing  the  estimated  baseline  background.  The  algorithm  is 
presented  in  the  following  two  steps,  although  in  practice  the  processing  can  be  performed  in  real  time 
using  a  finite  size  buffer: 

4.3. 1.1  Background  estimation  using  a  linear  fit  model 

The  background  can  be  referred  as  the  DC  level  of  the  temporal  profile:  the  DC  level  is  constant  for 
noise  dominated  temporal  profiles  but  time  varying  when  a  clutter  is  present.  The  DC  is  estimated  in  a 
piecewise  fashion  by  using  a  long  term  sliding  window  and  performing  estimation  on  each  set  of 
samples  separately.  The  number  of  samples  of  each  long  term  window  is  denoted  by  M.  The  following 
linear  model  is  used  for  estimating  the  DC;  for  the  sake  of  simplicity  the  description  of  the  estimation 
is  relevant  for  a  single  window: 

(4.5)  y  =  ax  +  b  +  n;x  =  [  1  2  ...  Adf 

where  n  is  the  noise,  a  and  b  are  the  coefficients  which  must  be  estimated,  y  is  the  DC  signal.  The  goal 
of  this  step  is  to  estimate  the  long-term  DC  baseline  using  a  least  squares  fit  to  the  linear  model 


represented  by  coefficients  vector  a  b 


T 


(4.5)  can  be  rewritten  as  follows: 

(4.6) 


y  =  X/3  +  n 


b 


1  1 
1  2 


where  p  = 


and  X  = 


a 


1  M 


Using  least  squares  estimation  the  following  equation  for  ft  is  obtained: 


(4.7) 


The  estimated  DC  of  a  single  window  becomes: 


(4.8) 


y  =  xp 
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The  estimated  DC  of  the  complete  signal  is  obtained  after  performing  the  above  calculations  for  each 
window  separately.  Figure  10  shows  two  synthetic  temporal  profiles  (one  with  the  target  implanted 
and  a  second  with  identical  noise  but  without  a  target)  and  their  estimated  DC  signals. 


1200r 


1150 


1100 


1050 


950- 


Example  of  DC  estimation 


20 


1 200  r 


1150 


1100 


Example  of  DC  estimation 


E 

< 


1050 


Jfv 

—  data 

—  data 

1000 

—  estimated  DC 

—  estimated  DC 

40  60 

time 


80 


100 


950- 


20 


40  60 

time 


80 


100 


Figure  10:  DC  estimation  on  a  signal  with  target  and  the  same  signal  without  a  target. 


The  estimated  DC  of  the  sub-profile  is  chosen  by  the  relation  location  at  the  complete  pixel,  Figure  1 1 
are  show  the  DC  estimation  on  a  signal  with  a  target  and  same  signal  without  the  target  at  the  sub- 
profiles  separations  point  of  view. 


Figure  11:  Sub  profile  DC  estimation  on  a  signal  with  target  and  the  same  signal  without  a  target. 


4.3. 1.2  Short  term  variance  estimation 

Step  2  calculates  the  short-term  variance  after  subtracting  the  estimated  long  term  DC  at  each  sub- 
profile.  The  complete  DC  signal  obtained  in  the  previous  step  is  denoted  by  DCj,  where  j  denotes  the 
index  of  sub-profile,  the  number  of  sub-profiles  is  defined  at  (4.4).  The  DCj  is  subtracted  from  the 
temporal  sub-profile  Pf. 

(4.9)  Pj  =Pj-  DCj 
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The  variance  estimation  is  calculated  by  using  a  sliding  short  term  window  and  performing  estimation 
on  each  set  of  samples  separately.  L  denotes  the  number  of  samples  of  each  short  term  window.  The 
short  term  variance  of  each  window  is  estimated  as  follows: 


(4.10) 


For  window  size  of  L  samples,  overlap  of  L0  samples  and  sub-temporal  profile  of  G  samples,  the 
number  of  windows  W  is  given  by: 


(4.11) 


W  = 


G-L, 

L-L0 


+  1 


Finally,  the  maximum  variance  value  of  a  given  temporal  profile  is  given  by: 
(4-12)  &L  =  max,<,<w{CJ,2l 


where  a;2  is  the  estimated  variance  of  the  ith  window. 

An  example  of  the  variance  response  to  the  presence  of  a  target  is  shown  in  Figure  12.  The  assumption 
is  that  the  presence  of  the  target  will  lead  to  a  short-term  variance  increase.  The  DC  subtraction  has  a 
clutter  suppression  effect  since  the  long-term  DC  tracks  the  clutter  influence  on  the  temporal  profile. 
The  graphs  of  Sub  Profile  4  and  5  were  scaled  to  the  range  of  the  pixel's  values  in  the  profile.  The 
scaling  was  done  in  order  to  show  the  range  of  the  variance  estimation  values. 

Finally,  a  likelihood  ratio  based  metric  is  used  to  evaluate  the  final  score  of  each  sub-temporal  profile. 
The  likelihood  ratio  in  this  case  is  given  by: 


(4.13) 


H0:  P  =  n 

//j :  P  =  t  +  n 


LRT  =  ?t 

°"o 
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^  A  9 

where  P  is  the  zero-mean  temporal  profile,  n  is  noise  and  t  is  the  target  signal.  crl  is  the  estimated 
variance  when  assuming  a  target  is  present;  a*  is  the  variance  estimated  assuming  the  absence  of 
target. 


Piofile  example  -  included  target 
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Figure  12:  Example  of  variance  estimation  on  a  synthetic  signal. 
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In  our  model  these  variances  are  estimated  as  follows: 


*2  -  A2 

G\  ^max 


(4.14) 


K 

= — 

°  Kt?  ' 


>2  1 
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where  a]  for  1  <  i  <  K  denotes  the  K  minimal  variance  values  of  each  temporal  profile.  The  value  of 

K  is  chosen  to  be  smaller  than  W,  so  as  not  to  include  values  which  might  be  caused  by  the  presence  of 
the  target. 


In  this  case,  the  final  score  of  each  sub-profile  is  given  by: 


(4.15) 


Score . 


1  K 


K  =  Defined  by  5  highest  or  lowest  values 


The  algorithm  performance  depends  on  a  wise  choice  of  parameters:  the  short-term,  long-term  window 
size  and  sub-profile  length.  The  long-term  window  size  serves  the  baseline  DC  estimation.  Since  the 
pixel  might  be  affected  by  clutter,  the  baseline  DC  is  not  constant  but  rather  changing.  It  is  assumed 
that  the  entrance  of  clutter  will  cause  a  monotonic  rise  or  fall  pattern  in  the  values  of  the  pixel’s 
temporal  profile.  Thus,  the  long-term  window  should  be  long  enough  to  perform  an  accurate 
estimation,  on  one  hand,  and  short  enough  to  closely  track  clutter  influence,  on  the  other.  In  any  case, 
the  long-term  window  should  be  at  least  longer  than  the  target  base  width  to  avoid  suppressing  it  [14] , 
long-term  window  is  calculated  to  the  overall  profile  to  achieve  best  estimation  of  a  DC  level,  the 
long-term  window  of  the  sub-profile  chosen  by  the  relation  location  at  the  overall  profile).  The  short¬ 
term  window  is  used  for  variance  estimation.  It  should  be  matched  (or  reduced)  to  the  target  width 
(which  depends  on  the  target  velocity).  If  the  short-term  window  is  significantly  longer  than  the  target 
width,  the  variance  change  caused  by  the  target  will  be  diminished.  Sub-profile  length  services  to 
achieve  a  pixel  target  velocity.  It  should  be  matched  (higher)  to  the  target  width.  The  importance  of 
these  three  window  sizes  and  the  overall  window  set  parameters  will  be  further  discussed.  It  is 
important  to  emphasis  that  the  temporal  algorithm  presented  here  doesn’t  assume  a  particular  target 
shape  and  width.  It  does  assume  a  maximum  spatial  size  of  the  target  (affecting  the  target  temporal 
profile)  and  a  positive  adding  of  the  target  to  the  background. 
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4.4  Performance  metric 

As  stated  in  section  3. 2. 3. 2,  defining  a  performance  metric  for  the  temporal  processing  algorithm  for 
both  synthetic  and  real  data  is  a  challenging  task.  In  this  work,  a  spatial  statistic  metric  is  defined  for 
evaluating  performance  on  the  real  data  sequences.  The  metric  divides  the  output  image  into  spatially 
equal  blocks  and  grades  each  block  with  a  statistically  based  score.  This  process  will  be  described  in 
the  following  sections. 


Page  41  of  102 


Project  Report  -  "Spatial  and  temporal  point  tracking  in  real  hyper  spectral  images" 


Benjamin  Aminov,  Ofir  Nichtern,  Stanley  Rotman 

4.5  Dynamic  Programming  Algorithm 

The  algorithm  is  implemented  using  the  following  assumptions: 

1 .  The  target  is  the  size  of  a  pixel  or  less. 

2.  Only  one  target  exists. 

3.  The  target  moves  in  any  possible  direction. 

4.  Target  velocity  is  within  0-2  {pixel/frame ]. 

5.  Images  are  too  noisy  for  using  a  detect  threshold  on  a  single  frame. 

6.  Jitter  is  up  to  0.5  [pixel/frame ]  in  horizontal  and  vertical  directions  only,  uniformly 
distributed. 

Since  the  target  velocity  is  within  the  range  of  0-2  \ppf\  and  a  possible  jitter  of  0.5  [ppj],  the  pixel  can 
move  up  to  2.5  \ppf)  in  the  horizontal  and  vertical  directions,  hence  a  valid  area  from  which  a  pixel 
might  origin  from  in  the  previous  frame  is  a  7x7  matrix.  Such  a  search  area  can  be  resized  according  to 
a  different  velocity  range  and  jitter,  in  the  fashion  described  above.  This  search  area  will  define  the 
probability  matrices  which  contain  the  probabilities  of  pixels  in  the  previous  frames  being  the  origin  of 
the  pixel  in  the  current  frame.  To  take  into  account  the  unreasonable  change  of  direction,  penalty 
matrices  have  been  introduced  which  are  used  to  build  the  probability  matrices  for  the  different 
possible  directions  of  movement.  These  matrices  give  high  probabilities  to  pixels  in  the  estimated 
direction  and  decreasing  probabilities  (punishment)  as  the  direction  vary  from  the  estimated  direction. 

4.5.1  Accumulated  Score  Matrix 

The  Accumulated  Score  Matrix  ( ASM)  is  not  based  on  a  single  frame,  but  rather  collects  information 
for  each  pixel  along  the  sequence.  The  ASM  contains  the  following  information: 

•  The  pixel’s  accumulated  score. 

•  Direction  (angle)  from  which  the  suspected  pixel  has  arrived. 

•  Summation  of  the  multiplications  between  the  accumulated  score  from  the  previous  frame 
with  the  punishment  matrix  in  the  estimated  direction  (will  be  elaborate  later  on). 

•  The  maximum  value  of  the  multiplications  between  the  accumulated  score  from  the 
previous  frame  with  the  punishment  matrix  in  the  estimated  direction  (will  be  elaborated  on 
later). 

•  The  index  of  the  pixel  in  the  previous  frame  from  which  the  pixel  originated,  maintaining 
the  tracks  with  the  highest  probabilities. 
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The  ASM  is  of  size  [N,M, frames,  5\ ,  where  NxM  is  the  frame  size,  frames  is  the  number  of  frames,  and 
5  is  the  number  of  variables  we  save  -  score,  direction,  sum,  max  and  index.  The  sum  and  max  can  be 
discarded  and  are  saved  only  for  purpose  of  later  investigation  of  the  algorithm  performance. 


The  accumulated  score  matrix  consists  of  two  parts  -  the  current  frame  score  and  the  accumulated 
score,  and  is  defined  as: 


(4.16) 


Score  (x,v)  =  a-Oscol  (x,y)  +  b- 
n v  7  n v  ' 


3  3 

g-j  X  Z  w  (^direction).  .-Score  j{i  +  x,j  +  y^> 

{ }  =  -3j  =  -3  1,1  n~  \ 

+  ('-*)■  max  |  w  (  direction .  ■  Score ^  _  fr  +  x,  j  +  >')  j 


=  -3...3,j  =  -3...3 


where: 

Scoren  (x,y)  -  the  accumulated  score  for  trajectory  that  ends  at  the  current  pixel, 

Anti-meann  (x,y)  -  the  score  given  by  the  preprocessing  stage  (anti-mean)  to  the  current  pixel  at  the 
current  frame, 

a  -  the  anti-mean  co-efficient  determining  the  amount  of  influence  on  the  total  score, 
wg  -  an  element  of  the  probability  matrix,  determining  the  probability  of  the  (i,j)  pixel  to  be  the  origin 
pixel  from  the  previous  frame,  based  on  target  speed,  jitter  and  approximated  direction  of  movement, 
b  -  the  memory  persistence  co-efficient,  determining  the  influence  of  the  accumulated  score, 
Yyv  Score n-i  -  the  summation  value  of  the  multiplications  between  the  accumulated  scores  in  the  7x7 
matrix  and  the  probabilities, 

g  -  the  sum  co-efficient  determining  the  amount  of  influence  of  the  summation  on  the  accumulated 
score, 

max(w-Scoren-i)  -  the  maximum  value  of  the  multiplications  between  the  accumulated  scores  in  the 
7x7  matrix  and  the  probabilities, 

(1  -  g)  -  the  sum  co-efficient  determining  the  amount  of  influence  of  the  maximization  on  the 
accumulated  score. 

The  new  accumulated  score  for  each  pixel  is  given  as  a  combination  of  the  score  from  the  whitening 
stage  of  the  current  frame  and  a  function  of  the  accumulated  score  from  the  previous  frame.  To 
calculate  the  function  of  the  accumulated  score,  a  7x7  temporary  matrix  is  created  by  multiplying  the 
probability  matrix  in  the  assumed  direction,  W(direction),  with  the  elements  of  the  ASM  of  the 
previous  frame.  The  function  is  composed  of  two  parts:  the  sum  of  all  the  49  values  of  the  calculated 
temporary  matrix,  and  the  maximum  value  of  the  calculated  temporary  matrix.  Several  parameters 
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have  been  introduced  into  the  formula  to  allow  for  changes  in  the  weight  of  each  part  on  the  calculated 
accumulated  score,  enabling  flexibility  in  the  tuning  of  the  system  to  the  optimal  condition,  if  one 
exists.  A  summary  of  the  parameters  will  be  given  later  on  in  Section  4.5.4. 

4.5.2  The  probability  matrix  W 

This  matrix  contains  the  probabilities  of  the  pixel  under  investigation  (suspected  to  be  a  target),  to 
originate  from  a  certain  location  in  the  previous  frame.  There  are  nine  different  probability  matrices 
for  each  of  the  possible  directions  -  0°  315°  at  jumps  of  45°  (directions  1-8),  and  standing  in  place 

(direction  9).  The  probability  matrix  is  given  as: 


(4.17) 


W  (direction)  =  p-Wbasic 


+  {l~  p)' 


w 'punishment  ( direction)-  Wf ,flme 
X  W punishment  ( direction )  •  Wframe 


where 

W  (direction)  -  the  probability  matrix  calculated  for  the  specific  direction, 

W basic  -  the  basic  probability  matrix  that  includes  the  effects  of  speed  and  jitter, 
p  -  the  Wbasic  co-efficient  determining  the  influence  of  Wbasic  on  W(direction), 

W punishment  (direction)  -  the  punishment  matrix  of  a  given  direction, 

Wframe  ~  the  matrix  used  to  fit  Wpunishment  to  the  set  of  velocities, 

(1-p)  -  the  WpUniShment  (direction)  co-efficient  determining  the  influence  of  WpuniShment  (direction)  on 
W(direction) . 

Figure  13  below  shows  the  7x7  punishment  matrix  Wpunishment  (—*)  (0°).  The  estimated  region  of  arrival 
gets  the  highest  values,  12+24  in  this  case.  The  values  decrease  until  reaching  the  cells  opposite  to  the 
estimated  direction.  As  can  be  seen,  the  punishment  on  change  of  direction  can  be  controlled  by  the 
parameter  y  that  changes  the  ratio  of  punishment.  Setting  y=24  achieves  the  highest  ratio,  whereas  y=0 
achieves  a  very  low  ratio.  By  using  negative  values  for  y  the  ratio  can  be  further  reduced.  There  is  a 
tradeoff  between  the  need  for  punishment  on  change  of  direction  and  the  need  for  adaptation  to 
maneuvering  targets. 
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Figure  13:  7x7  punishment  matrix  in  the  estimated  direction  — 


The  W(direction)  matrix  is  a  superposition  of  the  basic  probability  matrix,  Wimsic,  encompassing  the 
target  and  the  allowed  jitter,  and  the  punishment  matrix,  Wpunishment  (direction),  in  the  assumed 
direction  with  normalized  elements.  Figure  14  below  shows  the  basic  probability  matrix,  the  eight 
punishment  matrices  for  the  different  directions,  and  the  resulting  nine  probability  matrices.  The  basic 
probability  matrix  encompasses  the  target  speed  and  the  allowed  jitter,  and  represents  standing  targets. 
The  punishment  matrices  give  high  probability  to  the  group  of  pixels  in  the  assumed  direction,  and 
'punish'  pixels  as  they  differ  from  that  direction.  In  the  figure  below,  the  target  is  assumed  to  be  at  a 
velocity  range  of  [0..1],  thus  only  a  5x5  matrix  is  needed  (including  the  assumed  jitter)  and  the  border 
pixels  are  zeroes  accordingly  using  Wframe.  The  p  parameter  was  set  to  0.5,  giving  equal  weight  to 
Wbasic  and  Wpunishment(direction) ,  and  the  punishment  parameter  y  was  set  to  24. 
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Figure  14:  (a)  the  7x7  Wh matrix,  after  using  the  frame  matrix  (b)  the  eight  7x7  Wpiln,\hmpnt  matrices  for  the 
different  directions  (c)  the  nine  W(direction)  matrices  for  p  set  to  0.5,  and  velocity  at  range  fQ..ll. 
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4.5.3  Direction  Calculation 

To  use  W(direction),  the  direction  from  which  a  pixel  originated  has  to  be  found.  The  calculation  of 
the  direction  starts  by  creating  a  temporary  7x7  matrix  in  which  the  adjacent  pixels  scores  (including 
the  pixel  under  investigation)  will  be  saved.  Afterwards,  the  direction  of  each  pixel  from  the  previous 
frame  is  found  (the  direction  is  retrieved  from  the  ASM).  Each  pixel  is  given  a  parameter  Var  that  gets 
a  value  according  to  the  pixel's  direction  in  the  previous  frame  and  its  direction  relative  to  the  pixel 
under  investigation  (central  pixel).  The  parameter  gives  a  maximal  value  for  identical  directions  and 
minimal  value  for  opposite  directions.  The  final  score  of  each  pixel  in  the  temporary  matrix  is  defined 
as: 

(4.18)  Source  _  Pixel  _  Score (x,  v)  =  Scoren_x  (x,y)-  w  (direction) -Var 

where  Score„.i(x+i,y+j)  is  the  pixel's  accumulated  score  from  the  previous  frame,  w(i,j)  is  the  value  of 
W(direction)  in  the  (i,j)  coordinate  and  in  a  direction  relative  to  the  pixel  under  investigation,  and  Var 
is  the  direction  difference  parameter  that  was  introduced  above.  The  pixel  with  the  highest  score  is 
chosen  as  the  'source  pixel'.  Since  each  group  of  pixels  in  the  matrix  belongs  to  a  specific  direction,  the 
direction  is  now  known,  and  is  saved  in  the  ASM. 

To  summarize,  to  calculate  the  pixel’s  direction: 

1.  Create  a  7x7  temporary  matrix. 

2.  Find  the  direction  of  the  pixels  in  the  previous  frame. 

3.  Calculate  Var  according  to  the  direction  consistency. 

4.  Calculate  the  temporary  matrix  score,  Source_Pixel_Score(x+i,y+j). 

5.  Find  the  pixel  with  the  highest  score,  and  deduce  the  direction. 

4.5.4  Parameter  Summary 

In  the  previous  sections,  several  parameters  that  control  the  algorithm  functionality  were  introduced. 
These  parameters  are  of  high  significance  since  they  allow  checking  the  influence  of  changes  in  the 
weights  each  part  is  given  on  the  determination  of  the  ASM,  allowing  the  algorithm  to  work  better,  and 
eventually  track  and  detect  targets  in  an  optimal  fashion.  A  summary  of  the  parameters  used  in  the 
algorithm  (a,b,p,g,y)\ 
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a  -  the  whitening  co-efficient  determining  the  amount  of  influence  on  the  total  score, 
b  -  the  memory  persistence  co-efficient,  determining  the  influence  of  the  accumulated  score, 
p  (and  (1-p))  -  the  Wbasic  (Wpunishment(direction))  co-efficient  determining  the  influence  of  the  7x7 
Wbasic  (  Wpunishment(direction ))  matrix  on  W(direction)  probability  matrix, 

g  ( 1-g )  -  the  sum  co-efficient  determining  the  amount  of  influence  of  the  summation  (maximum)  on 
the  accumulated  score, 
y  -  the  punishment  parameter. 

Previous  tests  on  the  algorithm  have  shown  that  optimal  results  occur  for  the  following  parameter 
values:  a= 1,  p= 0.5,  >’=24.  The  next  section  deals  with  the  issue  of  the  memory  coefficient  and 
discusses  the  other  two  parameters,  b  and  g. 
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4.5.5  The  effective  memory  coefficient  (EMC) 

As  can  be  seen  from  the  accumulated  score  formula,  b  determines  the  memory  persistency  or  the 
decaying  influence  of  the  scores  from  the  previously  processed  frames.  To  prevent  the  algorithm  from 
'exploding'  an  effective  memory  coefficient  (EMC)  has  to  be  within  the  range  of  values  [0..1].  The 
EMC  is  introduced  due  to  the  fact  that  b  itself  is  not  the  memory  coefficient,  since  it  is  multiplied  by  g 
or  (1-g).  The  smaller  b  is  the  less  effect  the  previous  frames  have  on  the  current  score  and  the  overall 
score  will  be  lower.  A  high  value  of  b  will  cause  high  memory  persistency  and  the  system  might  suffer 
from  low  adapting  capabilities,  which  is  important  in  cases  of  altering  direction  targets.  Correct  value 
of  EMC  can  make  the  detection  process  much  more  effective  by  getting  rid  of  old  data,  which  is  no 
longer  needed  for  the  updated  decision  making,  thus  providing  faster  detection  capabilities  of  well 
maneuvering  targets. 

In  previous  research  a  preferred  memory  coefficient  was  obtained:  /?=0.8.  The  formula  for  the 
accumulated  score  in  the  DPA-1D  was:  Scoren  =  anti-meann  +  f  •  maxfw  •  Scoren.i}  where 
/M).8 /max{w}  so  that  0  <  f  <  1.  In  the  current  DPA-2D,  the  score  formula  is:  Scoren  =  anti-meann  + 
ffg  ■  sumfw  •  Score,,. /}+( 1-g)  'maxfw  ■  Scoren.i}] . 

Similarly  to  DPA-1D  we  demand:  b  =  f  =  0.8  / [g  •  sumfw}  +  (1-g)  •  maxfw}] .  By  substitution  of  g  = 
0,  g  =  1,  we  get:  fg=o  =  0.8  /  maxfw }  ,  fg=i  =  0.8  /  sumfw}.  After  rearrangement,  we  get  the  following 
formula  for  EMC : 


(4.19) 


b  =  f  = 


g-.?wm{w}  +  (l-g)-max  jw} 

08 


[*•/£. +(!-«)'/£„] 


Using  the  probability  matrices  achieved  empirically  an  appropriate  range  for  b  can  be  found  so  that  0  < 
EMC  <  1.  From  the  calculation  sumfw}  =  1  for  all  directions,  maxfw}  =  0.1054  for  directions  1  to  8, 
and  maxfw}  =  0.1719  for  direction  9  (standing  in  place).  Substituting  that  we  get:  fig=o  =  7.5913,  for 
direction  1  to  8,  and  fg=o  =  4.6545  for  direction  9,  and  fg=i  =  0.8  for  all  directions.  Putting  that  into 
/?,  we  get :  /?  =  [1.1183  •  g  +  0.1317]'1  for  direction  1  to  8,  and  /?  =  [1.0352  •  g  +  0.2148]'1  for  direction 
9.  So  now  we  have  a  formula  that  tells  us  what  values  to  give  b  for  a  given  value  of  g.  Since  0<g<f 
a  graph  of  b  vs.  g  can  be  plotted,  as  in  the  graph  below,  and  a  valid  range  of  values  for  b  can  be  found. 
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Figure  15:  b  vs.  g  for  the  direction  1-8  (solid  line)  and  direction  9  (dashed  line! 


As  can  be  seen  from  the  graph,  a  valid  range  of  values  for  b  is  about  [0..8]  for  g=0  and  [0..1]  for  g=  1. 
Since  /8g=i  =  0.8  for  all  directions,  memory  values  of  around  0.8  should  give  the  optimal  performance 
for  g=  1,  regardless  of  the  scenery.  Since  the  targets  in  the  tested  IR  sequences  move  at  sub-pixel 
velocities,  mainly  0.2-0. 3  [ppf],  most  of  the  time  they  are  on  direction  9  (standing  in  place),  thus  we 
expect  the  algorithm  to  be  most  effective  for  />„=/  =  0.8  and  pg=o  close  to  4.65. 


Finding  the  optimal  V  s  for  the  different  scenes  will  be  done  by  optimizing  an  algorithm  score,  given 
by  a  metric  that  will  be  defined  in  the  following  section. 
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5  System  evaluation 

5.1  Evaluation  of  the  temporal  algorithm  on  synthetic  data 

5.1.1  Synthetic  IR  frames  creation 

In  order  to  evaluate  the  performance  of  the  spatial  and  temporal  tracking  algorithm,  synthetic  temporal 
profiles,  which  simulate  different  types  of  clutter  and  background  behavior,  are  created.  Target  signal 
is  implanted  into  these  background  signals  in  order  to  simulate  target  traversing  both  clutter  and  noise 
dominated  scenes.  [11]  states  that  the  temporal  noise  is  closely  matched  to  white  gaussian  noise 
( WGN ),  therefore  WGN  at  various  SNRs  is  used  to  test  the  algorithm.  SNR  is  defined  as: 


(5.1) 


snr=^L 

<J 


Where  a2  is  the  noise  variance,  MaxT  is  the  target's  maximum  peak  amplitude. 


Figure  16  below  shows  the  different  types  of  signals  used  to  test  the  algorithm. 


Figure  16:  Synthetic  background  signals. 
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The  type  1  signal  simulated  relatively  fast  and  small  clutter  formation  passing  through  a  pixel.  The 
type  2  signal  simulates  slowly  entering  clutter;  Type  3  is  the  symmetrical  slowly  existing  clutter.  Type 
4  simulates  a  noise  dominated  scene  in  which  the  base  time  line  is  constant;  the  type  5  signal  serves  as 
reference  to  the  best-case  scenario  which  comprises  a  constant  zero-mean  base  line. 

Target  temporal  profiles  are  characterized  by  a  rapid  rise  and  fall  pattern.  This  behavior  might  be 
modeled  by  a  half  sine  or  triangular  shape  as  shown  in  Figure  17. 
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Figure  17:  Synthetic  target  examples. 


The  base  width  corresponds  to  target  velocity.  Simulations  show  that  there  are  no  significant 
performance  differences  between  the  sine  and  triangular  shaped  targets  [1]  . 


Figure  18  on  the  next  page  shows  the  various  background  models  with  the  sine  shape  implanted  at  an 
SNR  of  4. 
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Figure  18:  Example  of  synthetic  signals  with  the  implanted  sine  target,  SNR=4. 
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5.1.2  Examination  of  the  temporal  algorithm  on  synthetic  data 

This  section  demonstrates  the  algorithm's  operation  on  the  synthetic  data  described  in  the  previous 
section.  The  following  parameters  affect  the  algorithm's  performance: 

1 .  The  background  type. 

2.  The  SNR,  which  is  based  on  noise  variance  and  the  target's  width  and  amplitude. 

3.  Parameters  of  the  windowing  procedure:  the  window  sizes  to  estimate  the  background  baseline 
DC,  the  grouping  spatial  window  size  to  convert  sub-pixel  target  velocity  to  the  pixel  target 
velocity  at  the  frame  (as  an  input  to  the  DPA ),  size  of  short  term  variance  at  each  sample  and  at 
each  grouping  and  the  step  size  of  each  (Overlapping). 

The  following  subsection  describes  the  dependence  of  the  performance  of  the  algorithm  on  these 
factors. 

5.1.2. 1  The  background  type 

The  background  type  affects  mostly  the  DC  estimation  capabilities  of  the  algorithm.  It  is  expected  that 
signals  having  constant  DC  level  (signals  of  Type  4  and  5)  and  signals  having  slowly  changing  DC 
(signal  of  Type  2  and  3)  would  be  the  easiest  in  terms  of  DC  estimation,  since  the  linear  regression  is 
capable  of  estimating  the  two  parameters  of  the  linear  model.  Signals  of  Type  1  are  the  most 
problematic  since  their  DC  is  not  overall  fitting  a  linear  model,  but  depends  on  piecewise  DC 
matching  windows  sizes  as  will  be  explained  later. 

Figure  18  until  Figure  28  illustrate  the  algorithm's  operation  on  the  various  signal  types.  The  figure 
shows  the  results  of  signals  with  and  without  implemented  target.  The  DC  signal  is  plotted  as  well  as 
the  estimated  variance  values  which  were  calculated  after  subtracting  the  estimated  DC  from  the 
signal.  The  simulation  was  run  for  DC  window  of  20  samples,  DC  overlap  of  50%,  sub  temporal 
profile  of  15  samples,  overlap  between  sub  profiles  of  5  samples  and  SNR  of  4.  The  target  width  is  10 
samples. 
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Figure  19:  Example  of  the  temporal  algrithm  operation  on  the  Typel  signal  with  target. 

As  can  been  seen  in  Figure  19  the  increase  in  the  variance  of  sub-profiles  2,  8  and  9  occurs  due  to  the 
imprecise  DC  estimation  of  the  background.  This  case  simulates  a  cloud  entering  and  exiting. 
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Nevertheless,  the  variance  score  of  the  target  (sub-profile  5,  6)  is  still  much  higher  than  sub-profiles  2, 
8  and  9.  The  variance  of  the  other  sub-profiles  (1,  3,  4,  7)  is  close  to  zero. 
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Figure  20:  Example  of  the  temporal  algrithm  operation  on  the  Type2  signal  with  target. 
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Figure  21:  Example  of  the  temporal  algrithm  operation  on  the  Type3  signal  with  target. 

The  DC  estimation  for  signals  Type  2  and  3  is  precise  since  the  signal  has  a  linear  model.  The  variance 
increases  significantly  when  the  target  passes  thru  the  pixel  and  is  close  to  zero  at  other  times. 
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Figure  22:  Example  of  the  temporal  algrithm  operation  on  the  Type4  signal  with  target. 


Figure  22  and  Figure  23  show  a  similar  behavior  of  signal  for  different  DC  levels.  As  expected,  the 
variance  of  each  of  signal  (Type  4  and  Type  5)  is  the  same. 
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Figure  23:  Example  of  the  temporal  algrithm  operation  on  the  Type5  signal  with  target. 


The  next  figures  show  the  results  of  temporal  processing  (variance  estimation)  for  the  different  signal 
types,  in  the  case  where  no  target  is  present.  Not  all  of  sub-profiles  are  shown,  since  the  exhibited 
variance  values  are  around  zero. 
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Figure  24:  Example  of  the  temporal  algrithm  operation  on  the  Typel  signal  without  target. 


By  analogy  to  Figure  19,  the  variance  increases  at  sub-profiles  where  the  cloud  enters  and  exits.  Such 
cases  (without  target)  may  cause  false  alarms  (FA). 
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Figure  25:  Example  of  the  temporal  algrithm  operation  on  the  Type2  signal  without  target. 


Figure  26:  Example  of  the  temporal  algrithm  operation  on  the  Type3  signal  without  target. 
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Figure  27:  Example  of  the  temporal  algrithm  operation  on  the  Type4  signal  without  target. 


Figure  25  to  Figure  28  show  that  the  temporal  processing  score  is  close  to  zero  for  signals  without  a 
target. 
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Figure  28:  Example  of  the  temporal  algrithm  operation  on  the  Type5  signal  without  target. 


At  this  point  a  simple  means  of  performance  evaluation  is  the  ratio  of  the  score  obtained  from  the 
synthetic  signal  which  contains  the  target  and  the  same  signal,  but  without  a  target  (Target/Noise  or 
T/N  ratio).  Table  1  shows  the  T/N  ratio  of  the  different  signal  types  obtained  by  averaging  500 
rehearses. 


It  is  important  to  state  that  the  T/N  ratio  is  not  the  metric  used  to  evaluate  the  algorithm  as  will  be 
shown  later. 


Signal  Type 

T/N  Ratio 

T/N  Ratio  [14] 

1 

1.8098 

1.72 

2 

10.9558 

4.34 

3 

10.9658 

4.41 

4 

10.9659 

4.29 

5 

10.9659 

4.41 

Table  1:  Target/Noise  ratio  for  varios  signal  types. 


As  expected,  the  performance  for  signal  Type  1  is  the  worst  among  the  different  signals.  Signals  2-5 
all  have  similar  performance.  If  we  compare  between  our  sub-temporal  processing  algorithm  and 
between  a  temporal  processing  algorithm  as  described  at  [14]  it  can  be  clearly  seen  (Table  1,  2nd 
column)  that  the  performance  has  increased  by  at  least  a  factor  of  two  for  signals  2-5,  and  has  an 
insignificant  increase  for  signal  of  Type  1 . 
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5.1.2.2  The  SNR 

As  in  many  other  applications,  the  SNR  is  an  important  factor  which  affects  the  temporal  algorithm's 
performance.  The  higher  the  SNR,  the  better  the  algorithm's  performance  is  expected  to  be  since  both 
the  DC  and  variance  estimation  will  be  more  accurate.  The  results  are  shown  in  Figure  29. 


T/N  Ratio  Vs.  SNR 


Figure  29:  T/N  ratio  Vs.  SNR  for  different  signal  types. 

The  algorithm  has  similar  response  for  signals  of  Type  2-5,  in  agreement  with  the  expectations  -  the 
performance  increases  as  the  SNR  increases.  The  performance  for  signal  Type  1  behaves  differently  - 
it  increases  with  the  SNR  but  at  a  slower  rate  than  signals  2-5  and  decreases  later  on.  Roughly,  the 
curve  of  signal  Type  1  is  a  scaled  version  of  the  curves  of  signals  of  Type  2-5;  the  scaling  factor 
originates  from  the  inaccurate  DC  estimation,  as  will  be  detailed  later.  Since  the  DC  of  this  signal 
doesn't  fit  to  a  linear  model,  and  the  estimation  is  performed  in  piecewise  fashion,  the  size  and  the 
position  of  the  windows  used  to  perform  the  estimation  act  as  limiting  factor  on  the  performance. 

5.1.2.3  The  window  sizes 

The  window  size  for  the  baseline  DC  estimation,  as  well  as  the  window  size  for  the  short  term 
variance  at  the  sub-profile  has  a  great  impact  on  the  algorithm  performance. 

Larger  window  sizes  for  baseline  DC  estimation  should  improve  the  DC  estimation  in  cases  where  the 
background  changes  monotonically  (like  signals  of  types  2-4).  A  too  large  a  DC  estimation  window 
size  might,  in  some  cases,  lead  to  inaccurate  tracking  of  the  clutter  form  and  cause  high  false  alarm 
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rates  (e.g.,  like  in  Type  1  signals).  Thus,  for  background  profiles  the  optimal  window  size  is 
determined  by  the  background  type  -  for  noise  dominated  background  or  backgrounds  containing 
monotonically  changing  clutter,  larger  window  sizes  are  preferred;  for  background  containing  rapidly 
changing  clutter  shorter  window  is  preferred.  For  target  temporal  profiles,  the  larger  the  DC  window, 
the  higher  the  profiles  score  since  the  presence  of  the  target  peak  will  less  influence  the  DC  estimation, 
obviously  estimated  DC  which  tracks  the  target  form  is  highly  undesirable  since  it  leads  to  target 
suppression.  Thus  the  optimal  DC  window  regarding  the  overall  algorithm  performance  is  the  one 
which  closely  tracks  background  changes  on  one  hand,  but  is  large  enough  not  to  track  the  target  peak. 

The  sub-temporal  profile  length  should  be  matched  to  the  target  sub-pixel  velocity  which  expressed  as 
the  base  width  of  the  peak  of  target  profile  and  the  sub-pixel  velocity,  although  there  is  no  acute  need 
for  an  exact  match.  The  short  term  variance  window  size  at  the  sub-profile  should  be  matched  to  the 
target  rise  time,  although  there  is  no  acute  need  for  an  exact  match  as  at  the  sub-temporal  profile 
length.  A  sub-profile  length  which  is  larger  than  the  target  width  will  disable  the  ability  to  track/detect 
target  with  a  sub-pixel  velocity.  Alternatively,  a  sub-profile  length  which  is  smaller  than  the  target 
width  will  allow  too  few  samples  at  the  sub-profile. 

A  short  time  variance  window  which  is  larger  than  the  target  rise  time  will  result  in  lower  score  for  the 
target  profile,  since  the  variance  calculated  on  each  window  is  normalized  by  the  window's  length. 
Thus  for  target  profile  the  optimal  variance  window  size  is  expected  to  be  less  than  or  equal  to  the  sub- 
profile  length,  but  not  larger.  In  fact,  the  shorter  the  window,  the  higher  the  score  of  target  profile.  On 
the  other  hand,  a  short  variance  window  is  more  sensitive  to  random  noise  spikes  in  temporal  profile 
dominated  by  noise.  Therefore  the  optimal  variance  window  size  for  noise  dominated  profiles  will 
tend  to  be  as  large  as  possible  in  order  to  diminish  the  effect  of  noise  spikes.  The  optimal  variance 
window  for  the  overall  algorithm’s  performance  is  the  one  which  best  compromises  the  need  to 
enhance  the  target  profile  score  (as  short  as  possible)  and  the  need  to  suppress  the  noise  short-term 
fluctuations  (as  long  as  possible). 

Another  impact  is  the  overlap  window  between  sub-profiles.  The  overlap  window  should  allow  for  the 
compensation  of  low  sub-pixel  velocity  that  derives  a  small  sub-profile  length.  Another  aspect  of  the 
overlap  window  is  the  need  to  create  more  sub-profiles,  as  defined  at  equation  (4.4),  since  more  sub¬ 
profiles  aids  to  achieve  a  more  accurate  tracking  estimation. 
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5.2  Evaluation  of  the  DP  A  on  real  data 

As  stated  in  section  3.2. 3.2,  the  performance  metric  is  defined  as  follows: 

The  block  SNR  is  given  as: 

(5.2) 

where  vy  is  the  set  of  pixels  belonging  to  the  (ij/b  block,  M  is  a  set  containing  the  five  pixels  with  the 
highest  gray  level  in  that  block,  and  a  is  the  standard  deviation  of  the  block  pixels.  The  formula 
performs  a  subtraction  between  the  expectation  value  of  the  highest  pixels  (target)  and  the  expectation 
value  of  the  rest  of  the  pixels  (background),  divided  by  the  standard  deviation  of  block  pixels.  Since 
the  probability  matrices  introduce  the  influence  of  target  pixels  on  adjacent  pixels,  these  influenced 
pixels  might  accumulate  higher  values  than  unaffected  pixels  (background),  and  can  be  regarded  as 
target  pixels.  This  might  lower  the  expectation  value  of  the  target,  but  will  lower  the  standard 
deviation  of  the  background,  since  these  high  pixels  are  higher  than  the  statistics  of  the  background,  to 
a  more  accurate  one. 


Block  _SNR(i,j)  = 


£[v,JeM]-£[v,jSEM] 


The  algorithm  score  is  given  as: 


(5.3) 


Score  =  ■ 


Block  _  SNR  (i,  /) 


(i,j)=TB 


{Block _SNR(i,jj]  . 


(i,j)=NTB 


{Block  _SNR(i,j)}{lJ)=N 


Using  this  metric  for  the  algorithm  score,  the  EMC  value  for  optimized  algorithm  work  can  be  found, 
in  both  cases  of  g=0  (the  max  part)  and  g=l  (the  sum  part)  in  the  accumulated  score  equation. 

The  DPA  algorithm  has  been  applied  to  several  IR  sequences  containing  different  scenes  and  clutter 
degree.  Each  sequence  contains  95  to  100  frames,  and  the  algorithm  ran  up  to  25  frames.  The  results 
are  shown  for  two  IR  sequences  -  na23a  and  npa.  A  single  frame  from  the  IR  sequences  showing  the 
targets  locations  are  shown  in  the  figure  below  (in  white  rectangles).  The  na23a  sequence  has  a  single 
target  moving  in  clear  sky  from  right  to  left  at  v  ~  0.3  [ppf].  The  npa  sequence  has  two  targets  moving 
at  v  ~  0.2  [ppf],  the  left  target  is  engulfed  in  clouds  moving  from  bottom  to  top  and  the  right  target  is 
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moving  in  scarce  clouds.  The  algorithm  will  be  applied  on  the  surrounding  of  the  left  target,  since  it  is 
of  more  interest  due  to  its  severe  clutter  condition. 


(a) 


(b) 


Figure  30:  A  single  frame  from  la)  na23a  lb)  nva  sequence  showing  the  target’s  locations  (white  rectangles). 


The  DP  A  is  applied  for  two  different  cases:  g  =  0  and  g  =  1,  for  different  values  of  b,  to  check  the 
effect  of  each  part  of  the  accumulated  score  formula  on  the  algorithm  results.  The  metric  applied 
afterwards  is  used  to  find  optimal  values  of  b' s.  Afterwards,  the  effect  of  Max_Numbers  (the  number 
of  pixels  with  the  highest  scores  considered  to  be  'target'  in  the  SNR  formula)  on  the  algorithm  score  is 
checked,  to  make  sure  the  metric  gives  reliable  score  performance  for  the  different  EMC s  (b) . 
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5.2.1  The  Preprocessing  Stage  (anti-mean) 

The  algorithm  starts  by  whitening  the  input  IR  sequence.  According  to  the  introduced  score  metric,  the 
sequence  is  divided  into  blocks.  The  target  block  and  the  eight  surrounding  blocks  are  then  whitened 
(anti-mean)  to  reduce  the  clutter  and  emphasize  the  target,  as  can  be  seen  in  the  figure  below  from  the 
1st  frame  of  the  na23a  sequence. 


Figure  31:  (at  1st  frame  of  the  original  na23a  sequence  divided  into  blocks,  (b)  1st  frame  of  target  block  (target  at 


center  of  frame),  tc)  1st  frame  of  target  block  after  the  preprocessing  (ANTI-MEAN)  Stage. 


The  process  shown  in  the  Figure  3 1  is  repeated  for  the  entire  sequence  before  proceeding  to  the  next 
stage  of  DPA  (The  frames  can  also  be  processed  individually  for  a  real-time  implementation). 


5.2.2  na23a  sequence 

5.2.2. 1  Preliminary  results  ( sampling  =  1) 

We  start  by  showing  the  result  for  the  na23a  sequence,  g  =  0.  Figure  32  shows  the  algorithm  score  for 
the  range  of  b  =  [0...8]  at  jumps  of  0.2.  The  graph  shows  the  scores  for  the  last  six  frames,  20-25  in 
this  case,  for  two  reasons:  1.  some  frames  might  be  noisier  than  the  others;  2.  we  wish  to  show  the 
accumulated  score  effect.  The  first  reason  might  cause  the  target  to  be  dimmer  compared  to  the  clutter 
or  disappear.  In  that  case,  the  memory  persistence  and  the  accumulation  of  the  score  from  frame  to 
frame  helps,  leading  to  the  second  reason,  the  accumulated  score  helps  overcome  noisy  frames,  and 
'accumulates'  SNR  for  the  dim  target.  Looking  at  the  last  frames  helps  to  see  whether  the  effect  of 
accumulation  is  enough  for  the  number  of  frames  processed,  or  whether  more  frames  need  to  be 
processed.  In  previous  research,  [20]  [21]  ),  the  algorithm  was  run  over  10  to  20  frames,  and  20  was 
found  to  be  better,  hence  25  frames  were  taken  in  this  case,  to  be  on  the  safe  side.  The  graph  shows  a 
peak  at  around  b  =  4.8  for  all  frames.  The  rise  at  about  b  =  6  is  irrelevant  since  the  algorithm  was 
unable  to  track  at  EMC  of  about  6  or  above.  It  can  be  seen  that  above  that  EMC,  all  the  frames  scores 
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converge,  meaning  that  the  EMC  is  too  high  and  the  weight  of  the  current  frame  is  insignificant  to  the 
accumulated  score.  It  seems  that  the  relevant  range  for  EMC  can  be  narrowed  to  b  =  [0. .  .6],  for  the  g  = 
0  case. 


Frame:  22  Frame:  21  Frame:  20 

io|H  >  ■■  10 
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(a)  (b) 

Figure  32:  (a)  The  algorithm  score  for  the  na23a  sequence,  g  =  0.  Max  Numbers  =  5.  b  =  [0...81,  and  six  last  frames 

tbi  Images  of  the  six  last  frames  after  the  DPA  algorithm. 

A  rise  in  the  algorithm  score  is  expected  as  the  frames  advance.  It  can  be  seen  the  graph  Figure  32  (a), 
that  the  score  of  the  20,  23  frames  is  lowest  for  b  <  3.5.  That  is  due  to  the  fact  that  these  frames  are 
noisy,  as  can  be  seen  in  Figure  32  (b).  Increasing  the  EMC  improves  the  score  in  them  -  more  memory 
persistence  and  less  weight  to  current  noisy  frame.  Since  the  target  moves  at  v  ~  0.3  [ppf\  (stays  in  the 
same  pixel  between  pixels  most  of  the  time)  the  expected  peak  was  at  around  b  =  4.65  in  agreement 
with  the  results. 

Figure  33  shows  the  graphs  of  the  standard  deviation  of  the  NTB,  the  expectation  value  of  the  NTB 
SNR,  the  TB's  SNR  (all  vs.  b ),  and  the  last  frame  (25)  for  various  values  of  b.  For  b  =  5.4  the  last  frame 
shows  high  values  for  the  trajectory  of  the  target,  meaning  that  the  EMC  is  very  high  and  every  pixel 
traversed  by  the  target  retained  it's  high  value  (Figure  33  (d)).  Thus,  EMC's  higher  than  that  will 
prevent  the  algorithm  from  tracking  the  target,  since  the  trail  pixels  and  the  pixel  of  origin  will  gain 
higher  accumulated  score,  as  will  be  seen  later  on  (Figure  51). 

We  continue  by  showing  the  results  for  the  g  =  1  case  in  Figure  34.  As  can  be  seen  from  the  algorithm 
score,  the  relevant  range  of  b  is  [0...1],  after  which  the  scores  of  the  various  frames  converge 
regardless  of  the  EMC.  The  TB’s  SNR  can  be  seen  to  fall  at  around  b  =  1.  The  algorithm  score  starts  at 
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its  maximum  (mean  value)  for  no  memory  and  drops  as  the  EMC  rises,  until  b  =  1 .  This  means  that  in 
this  case,  the  algorithm  is  not  needed  for  the  detection. 


(c) 


(d) 


Figure  33:  tat  Standard  deviation  of  NTITs  SNR ,  (b)  Expectation  value  of  NTB's  SNR ,  (c)  TB's  SNR ,  (d)  Last 

processed  frame  i25th)  for  various  b's. 


Figure  35  shows  the  influence  of  the  EMC  on  the  pixels  of  the  last  frame.  In  the  g  =  1  case,  when 
starting  to  raise  the  EMC  to  too  high  values  (above  b  =  1),  no  trail  appears  as  in  the  g  =  0,  but  a  'snow 
ball'  effect  starts  to  build.  The  'snow  ball'  grows  until  the  target  is  engulfed  in  it,  and  no  tracking  is 
possible.  Due  to  the  probability  matrix  and  the  summation,  pixels  close  to  where  the  target  passes 
grow  in  their  value.  If  the  EMC  is  high,  these  adjacent  pixels  will  also  receive  high  accumulated 
scores,  and  so  the  ball  grows  from  frame  to  frame.  The  ANTI-MEAN  component  weakens,  and  its 
weight  decreases.  Since  the  current  frame  influence  is  negligible,  the  main  influence  comes  from  the 
previous  frames.  This  causes  the  graph  to  have  constant  spaces  between  the  frames.  Since  the 
expectation  values  of  the  blocks  grow  as  the  frames  progress,  the  SNR  decreases  (The  target 
expectation  value  grows,  but  the  'background'  expectation  value  also  grows,  so  that  the  numerator 
decrease  in  the  SNR  formula).  There  is  a  convergence  to  an  asymptotic  value,  where  frame  20  gets  the 
highest  score,  whilst  frame  25  gets  the  lowest  score  (due  to  the  lower  SNR  as  frames  progress). 
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(a)  (b) 


(c) 


(d) 


Figure  34:  (a)  Algorithm  score  for  g  =  1,  (b)  Standard  deviation  of  NTB's  SNR ,  (c)  Expectation  value  of  NTB's  SNR , 

(d)  TB's  SNR . 


To  conclude,  a  valid  range  of  EMC,  b  =  [0...0.8],  has  been  found  in  which  the  algorithm  is  able  to 
track  and  detect  the  target  correctly;  this  concurs  with  the  theoretical  range. 


10  20  30  10  20  30  10  20  30  10  20  30  10  20  30  10  20  30 

b:  1.2  b:  1.4  b:  1.6  b:  1.8  b:  2  b:  2.2 


10  20  30  10  20  30  10  20  30  10  20  30  10  20  30  10  20  30 


Figure  35:  Last  processed  frame  (25th!  for  b  =  [0...11. 

S.2.2.2  The  sampling  variable 

Since  the  core  of  the  algorithm  is  the  usage  of  the  punishment  matrices,  the  target  has  to  move  at  a 
velocity  of  at  least  around  1  \ppf\ .  If  the  target  in  the  sequence  moves  at  lower  velocities,  v  ~  0.3 
\ppf\  in  our  case,  the  punishment  matrices  are  used  only  every  three  frames  roughly.  Since  that  is  the 
case,  sampling  of  the  sequence  has  been  suggested,  so  that  the  target  moves  at  higher  speed.  The  first 
simulation  was  done  for  sampling  =  4  giving  the  target  a  velocity  of  v  ~  1.2  [ppf\.  To  keep  the  target  in 
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the  'target  block1,  a  larger  block  size  of  30x80  was  used  instead  of  30x30.  The  algorithm  score  and  the 
last  six  frames  are  shown  in  Figure  36,  for  the  g  =  0  case. 

The  score  achieved  is  lower  in  this  case  due  to  the  noisy  frames  compared  to  the  last  six  frames  in  the 
sampling  =  1  case.  It  should  be  noted  that  by  sampling  the  sequence  the  algorithm,  different  frames 
were  processed  that  might  differ  in  their  noise  degree.  Nevertheless,  the  score  shows  the  effect  of 
accumulation  as  the  frames  progress  -  the  peak  is  distinguishable  (above  frame  20),  and  is  around 
b  =  5.0  for  the  last  three  frames.  An  increase  in  the  EMC  is  expected  since  the  target  now  moves  faster. 
It  seems  that  the  algorithm  needs  around  20  frames  for  the  dim  point  target  to  accumulate  enough  SNR 
to  be  distinguishable  from  the  clutter. 
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Figure  36:  (a)  sampling.  =  4:  The  algorithm  score  for  g  =  0,  Max  Numbers  =  5,  b  =  [0...81,  and  six  last  frames. 

ibl  Images  of  the  six  last  frames  after  the  DPA  algorithm. 


Figure  37  shows  the  graphs  of  the  standard  deviation  of  the  NTB,  the  expectation  value  of  the  NTB 
SNR,  the  TB's  SNR  (all  vs.  b),  and  the  last  found  target  track  for  sampling  =  4  (the  pink  portion  of  the 
track  is  the  target  track  for  sampling  =1). 
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Figure  37  :  sampling  =  4  :  (a)  Standard  deviation  of  NTB's  SNR ,  (b)  Expectation  value  of  NTB's  SNR ,  (c)  TB's  SNR , 

(d)  the  target's  track  (the  pink  track  is  the  target  track  for  sampling  =  1) 


Comparing  Figure  33  to  Figure  37  shows  that  the  TB  &  NTB  SNR's  behave  likely  for  sampling  =  1  and 
only  TB  SNR  has  a  peak  for  sampling  =  4.  This  shows  that  the  penalty  matrices  help  distinguish 
between  target  and  clutter,  and  that  the  algorithm  needs  targets  at  around  v  =  1  \ppf\  to  work 
effectively.  In  the  case  of  g  =  1,  the  algorithm  score  is  lower  compared  to  the  one  achieved  for 
sampling  =  1,  due  the  noisy  last  frames,  as  in  the  g  =  0  case. 

The  next  sub-sections  deals  with  the  issue  of  using  values  of  the  Max_Numbers  parameter  that  will 
correctly  take  only  target  pixels  as  the  highest  pixels. 

5.2.2.3  Finding  Max  Numbers 

As  specified  before,  the  parameter  Max_Numbers  controls  the  number  of  pixels  considered  as  'target' 
pixels,  and  thus  affects  the  result  of  the  SNR  of  the  blocks  and  consequently  the  algorithm  score.  So  far 
Max_Numbers  =  5  has  been  used.  Instead  of  using  an  arbitrary  number  of  highest  pixels,  a  value  for 
Max_Numbers  can  be  found  via  optimization  of  the  algorithm  score,  for  the  two  cases  of 
g  =  0  and  g  =  1 .  A  value  of  b  =  4.8  and  sampling  =  4  was  taken.  The  optimization  was  first  run  for  the 
g  =  0  case.  Figure  38  shows  a  peak  of  mean  algorithm  score  for  Max_Numbers  =  3  (there  is  a  peak  for 
all  frames  but  the  19th  frame  where  there  is  not  enough  accumulation  as  discussed  before).  A  decrease 
in  the  TB's  SNR  can  also  be  seen  from  the  figure  for  Max_Numbers  >  3. 
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Figure  38:  (a)  The  algorithm  score  vs.  Max  Numbers ,  (b)  The  TB’s  SNR  vs.  Max  Numbers 


Figure  50  shows  the  TB's  SNR  vs.  b.  The  graph  is  separated  to  three  sections  according  to  the  resulting 
highest  pixels  in  the  last  frame  -  (1)  Target  pixels,  (2)  Trail  &  Target  pixels,  (3)  Pixel  of  origin  and 
Trail  pixels.  The  figure  in  the  next  page  elaborates  on  the  section  separation.  Note  that  the  resulting 
separation  concurs  with  the  relevant  range  of  b  =  [0...6],  The  graph  helps  to  distinguish  a  good  range 
in  which  the  highest  pixels  taken  as  'target  pixels'  indeed  belong  to  the  target. 


Figure  39:  The  TB's  SNR  vs.  b 
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In  the  case  of  the  5  highest  pixels,  the  score  takes  only  the  target  pixels  up  to  b  =  5.6.  Above  that  value 
the  trail  pixels  start  to  accumulate  higher  scores  than  the  target  pixels,  until  the  pixel  of  origin  also 
accumulates  a  higher  score  ( b  =  6.0).  In  the  case  of  3  highest  pixels,  the  score  takes  only  the  target 
pixels  up  to  b  =  5.8.  Higher  EMC s  causes  non-target  pixels  to  be  higher  than  the  target  pixels  as 
before.  In  this  case  of  g  =  0  and  sampling  =  4,  the  algorithm  was  able  to  track  and  locate  the  target  up 
to  b  =  6.0.  Using  that  and  the  figure  below  leads  to  the  conclusion  that  a  good  range  of  EMC s  would 
be  b  =  [4.4  ...  6].  This  range  contains  the  theoretical  EMC  for  non-moving  targets. 


b=  4.4  b=  4.6  b=  4.8 


20  40  60  80  20  40  60  80  20  40  60  80 
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Figure  40:  The  last  frame  of  the  algorithm,  highest  3  pixels  emphasized 


The  discussion  continues  now  to  the  g  =  1  case.  The  last  frames  of  the  algorithm  and  the  target's  track 
vs.  b  are  shown  in  Figure  52.  It  can  be  clearly  seen  that  for  b  =  [0...0.8]  there  is  only  one  high  pixel 
belonging  to  the  target  (surrounded  by  white  circle).  The  algorithm  tracks  the  target  in  the  range  of 
b  =  [0.2. .  .0.8],  These  results  suggest  that  Max_Numbers  =  1  and  b  =  [0.2  . . .  0.8]  should  be  used  in  the 
case  of  g  =  1 . 

b=  0.2 
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Figure  41:  The  last  frame  of  the  algorithm,  highest  5  pixels  emphasized 
(target  pixel  surrounded  by  a  white  circle)  vs.  b 


b=  0.8 


b=  0.4  b=  0.6 


The  algorithm  score  using  Max_Numbers  =  1  is  shown  in  Figure  53  Below.  A  peak  in  the  mean 
algorithm  score  can  be  clearly  seen  for  b  =  0.6  and  the  best  tracking  is  achieved  for  b  =  0.8  (not 
shown)  in  accordance  with  the  theoretical  value.  The  algorithm  score  is  higher  in  this  case  than  the 
g  =  0  case  for  sampling  =  4,  and  higher  than  both  cases  for  sampling  =  1 .  This  result  suggests  that  the 
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algorithm  performs  better  for  faster  targets,  given  correct  value  of  Max_Numbers  according  to  the  case 
under  investigation. 


Figure  42:  Algorithm  score  for  g  =  1  case.  Max  Numbers  -  1,  sampling  -  4 

5.2.2.4  Summary 

Results  of  the  na23a  sequence  have  been  shown,  followed  by  discussion  and  further  results  of  the 
Max_Numbers  and  the  sampling  parameters.  Preferable  values  for  Max_Numbers  were  found,  3  for  the 
g  =  0  case,  and  1  for  g  =  1  case.  Relevant  ranges  of  b's  were  also  found  for  each  case,  b  =  [4.4. .  .6]  for 
the  g  =  0  case,  b  =  [0.2.  ..0.8]  for  the  g  =  1  case.  The  core  of  the  algorithm  is  the  punishment  matrices. 
If  the  target  is  moving  at  a  sub-pixel  velocity  these  matrices  are  not  used  often  and  the  algorithm  will 
not  perform  at  its  peak.  In  order  to  find  the  velocity  for  which  the  algorithm  works  best,  the  algorithm 
has  been  run  over  the  sequence  for  sampling  =  [1...4]  for  both  cases.  For  each  sampling  the  highest 
score  in  the  relevant  ranges  of  b's  was  chosen.  It  should  be  noted  that  for  the  algorithm  to  accumulate 
effectively,  at  least  20  frames  have  to  be  processed,  thus  limiting  the  amount  of  sampling  that  can  be 
done,  depending  on  the  number  of  available  sequence  frames.  Also,  different  frames  are  processed  for 
the  different  samplings,  so  a  comparison  might  not  be  precise.  The  results  are  shown  in  Figure  43. 


Figure  43  :  Algorithm  score  vs.  sampling,  (a)  g  =  0  case.  Max  Numbers  =  3  tbi  g  =  1  case.  Max  Numbers  =  1 


The  algorithm  has  a  preferable  target  velocity  at  around  v=0.6  [ppf\. 
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5.2.3  npa  sequence 

The  sequence  contains  a  target  moving  at  v  ~  0.2  [ppf\  in  the  proximity  of  clouds.  In  this  case,  the 
cloud's  edges  in  the  target  block  pose  a  challenge  since  they  behave  like  targets,  and  receive  a  high 
score  from  the  whitening  preprocessing  stage.  In  this  sequence,  tracking  and  detection  was  achieved 
for  sampling  >  3.  Hence,  detailed  results  for  lower  sampling  values  will  be  skipped  and  only  detailed 
results  for  sampling  =  3  will  be  shown.  The  summary  sub-section  will  show  scores  for  the  various 
sampling. 


5.2.3. 1  Results  {sampling  =  3) 

Figure  44  shows  the  following  graphs  for  the  g  =  0  case  :  the  algorithm  score,  the  standard  deviation  of 
the  NTB,  the  expectation  value  of  the  NTB  SNR,  the  TB's  SNR  (all  vs.  b),  and  the  last  frame  (25)  for 
various  values  of  b. 
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Figure  44  :  sampling  =  3  :  (a)  Algorithm  score  for  the  twa  sequence,  g  =  0.  Max  Numbers  =  3  (b)  Standard  deviation 

of  NTB's  SNR  (c)  Expectation  value  of  ATfi’s  SNR  (d)  TB\  SNR 


The  resulting  scores  are  negative  for  all  the  b's  in  the  range.  This  is  due  to  the  cloud  edges  at  the 
bottom  of  the  target  block  that  get  high  accumulated  scores  (Figure  45  (a)).  This  causes  the  TB's  SNR 
to  be  low.  Cloud  edges  in  the  NTB  give  rise  to  their  SNR,  leading  to  a  negative  algorithm  score. 
Nevertheless  the  algorithm  is  able  to  track  and  detect  the  target  for  b  =  [2. .  .4.8],  Higher  EMC s  lead  to 
cloud's  edge  having  higher  score  than  the  target  itself  in  the  TB,  and  no  tracking. 
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Figure  45  :  (a)  The  last  frame  of  the  algorithm,  highest  5  pixels  emphasized  vs.  b 
(target  pixel  surrounded  by  a  white  circle),  (b)  The  target  track  vs.  b 


The  results  for  the  g  =  1  have  also  been  negative  as  the  g  =  0  case.  Nevertheless,  the  algorithm  has 
been  able  to  track  and  detect  the  target  for  b  =  [0. .  .0.8]. 

5.2.3.2  Summary 

Results  of  the  npa  sequence  have  been  shown.  Preferable  values  for  Max_Numbers  were  found,  3  for 
the  g  =  0  case,  and  1  for  g  =  1  case.  Relevant  ranges  of  b' s  were  also  found  for  each  case,  b  =  [2. .  .4.8] 
for  the  g  =  0  case,  b  =  [0...0.8]  for  the  g  =  1  case.  The  algorithm  performed  best  for  sampling  =  3, 
where  the  effective  target  velocity  was  at  around  v=0.6  \ppf\  as  in  the  na23a  sequence. 

5.2.4  DPA  Results  Summary 


Relevant  ranges  of  b's  for  which  the  algorithm  was  able  to  track  and  detect  the  target,  and  the  optimal 
Max_Numbers  values,  are  shown  in  Table  2. 


Accumulated  Score 

Relevant  range  of  b's  where  tracking  occurred 

Max  Numbers 

Part 

npa  sequence2 

na23a  sequence 

sum  part  (g  =  0) 

[2. .4. 8] 

[4.4. .6] 

3 

max  part  (g  =  1) 

[0..0.8] 

[0.2. .0.8] 

1 

Table  2:  EMC's  for  which  tracking  occurred  in  the  IR  sequences. 


2  Tracking  occurred  for  sampling  >  3 
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5.3  Evaluating  of  the  temporal  processing  algorithm  on  real  data 
5.3.1  Real  IR  sequences 

The  real  world  IR  image  sequences  from  Reference  [16]  are  used  as  a  means  for  evaluating  the 
temporal  algorithm.  The  movies  are  comprised  of  95  or  100  12  bit  infrared  frames.  The  sequences 
contain  raw  data  of  unresolved  targets  flying  around  Hanscom  AFB.  There  are  five  scenes  in  the 
available  dataset  containing  various  types  of  clutters  and  sky  as  well  as  targets  moving  at  different 
velocities. 


Figure  46  shows  a  single  frame  (frame  50)  of  each  of  the  IR  sequences  examined  in  this  work. 
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Figure  46:  Frame  50  of  each  real  data  IR  sequence. 
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Table  2  summarizes  the  number  and  nature  of  targets  for  each  IR  sequence,  as  well  as  the  background 
type  of  scene. 


IR  sequence  name 

Scene  description 

NPA 

two  targets  in  wispy  clouds 

J13C 

one  slow  target  in  clear  of  cloudy  scene 

NA23 

one  fast  target  in  bright  clouds 

J2A 

two  targets  in  fluffy  clouds 

M21F 

one  weak  target  in  hot  hazy  night  sky 

Table  3:  Description  of  the  IR  sequences. 


5.3.2  Evaluation  metrics 

In  order  to  evaluate  the  algorithm  a  new  metric  has  been  defined,  as  describe  at  section  3.2. 3.2.  Each 
frame  in  the  sequence  is  divided  into  blocks,  the  size  of  HxN  (30x30  were  used). The  algorithm  is  run 
over  9  blocks  -  Target  block  and  eight  adjacent  blocks.  The  SNR  of  the  target  block  (75)  and  its  eight 
adjacent  non-target  blocks  (NTB)  are  calculated.  Afterwards,  an  algorithm  score  is  calculated  based  on 
the  resulting  SNR' s. 


The  block  SNR  is  given  as: 


(5.4) 


Block  _  SNR(i,  j) 


where  vtj  is  the  set  of  pixels  belonging  to  the  (i,j)th  block,  M  is  a  set  containing  the  five  pixels  with  the 
highest  gray  level  in  that  block,  and  a  is  the  standard  deviation  of  the  block  pixels.  The  formula 
performs  a  subtraction  between  the  expectation  value  of  the  highest  pixels  (target)  and  the  expectation 
value  of  the  rest  of  the  pixels  (background),  divided  by  the  standard  deviation  of  block  pixels.  Since 
the  probability  matrices  introduce  the  influence  of  target  pixels  on  adjacent  pixels,  these  influenced 
pixels  might  accumulate  higher  values  than  unaffected  pixels  (background),  and  can  be  regarded  as 
target  pixels.  This  might  lower  the  expectation  value  of  the  target,  but  will  lower  the  standard 
deviation  of  the  background,  since  these  high  pixels  are  higher  than  the  statistics  of  the  background,  to 
a  more  accurate  one. 
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The  algorithm  score  is  given  as: 


(5.5) 


Score = 


Block  _ SNR (i,  j)  - E  [Block _  SNR (i,  y)}  . 


I  (i  J)=NTB 


{Block _SNR(i,j)} 


(/j)=jwb 


The  score  is  calculated  by  subtracting  between  the  target  block  SNR  and  the  expectation  value  of  the 
SNR  of  the  non-target  blocks,  divided  by  the  standard  deviation  of  the  non-target  blocks  SNR. 

The  final  grade  of  the  algorithm  serves  as  a  tool  for  comparison  between  the  suggested  temporal 
processing  algorithm  and  other  temporal  processing  algorithms  which  deal  with  the  same  problems. 
The  grade  evaluates  the  difference  between  the  score  of  the  block  containing  the  target  and  the 
expected  values  of  the  rest  of  the  blocks  in  the  image,  normalized  by  their  standard  deviation. 

5.3.3  Real  data  results 

The  real-world  IR  image  sequences  described  in  the  previous  section  were  used  to  evaluate  the 
temporal  tracking  algorithm.  The  algorithm  output  images  are  given  in  Figure  47  together  with  a 
representative  frame  from  each  sequence.  The  sequences  were  chosen  to  comprise  both  clutter  and 
noise  dominated  scenes.  The  parameters  of  each  simulation  were  chosen  to  be  the  optimal  set,  as  will 
be  explained  in  the  following  subsection.  The  target  tracks  are  seen  as  brighter  short  stripes;  in  some 
cases,  clutter  leakage  is  also  evident. 


Frame  70  from  IR  sequence  NPA  Result  lmage  of 

IR  sequence  NPA 
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Frame  70  from  IR  sequence  M21F 


Result  Image  of  IR  sequence  M21 F 


□ 


Frame  70  from  IR  sequence  J2A 


Result  Image  of  IR  sequence  J2A 


Frame  70  from  IR  sequence  NA23 


Result  Image  of  IR  sequence  NA23 


Figure  47  :  Output  images  of  the  temporal  tracking  algorithm. 


Figure  48  provides  a  closer  view  of  the  block  containing  the  target  of  each  output  image  together  with 
a  representative  target  temporal  profile. 
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Target  profile  from  IR  sequence  NPA 
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a.  Target  blocks  and  profiles  from  the  IR  sequence  NPA 


frame 


b.  Target  block  and  profile  from  the  IR  sequence  M2  IF 


Target  block  from  IR  sequence  J13C  -  Block  47 


Target  profile  from  IR  sequence  J13C 
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c.  Target  block  and  profile  from  the  IR  sequence  J13C 
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frame 


d.  Target  blocks  and  profiles  from  the  IR  sequence  J2A 


frame 


e.  Target  block  and  profile  from  the  IR  sequence  NA23 


Figure  48:  Real  data  results. 


Visual  evaluation  of  the  results  images  presented  in  Figure  48  suggests  that  in  terms  of  target  pixels 
enhancement,  the  best  results  were  obtained  for  the  sequences  J2A  and  NPA  since  the  target  trace 
there  is  stronger  relative  to  the  background.  The  M2  IF  sequence  for  example  has  more  noise  in  the 
background,  although  the  target  pixels  are  clearly  visible.  The  metric  defined  for  assessing  the  overall 
performance  of  the  algorithm,  which  was  given  in  equation  (5.4)  takes  into  consideration  not  only  the 
target  enhancement  but  the  suppression  of  the  background  abilities  of  the  algorithm  as  well.  This  is 
achieved  by  grading  each  block  by  a  score  which  evaluates  the  difference  between  the  maximal  5 
values  of  the  block  and  the  block  average  values,  normalized  by  the  standard  deviation.  The  goal  of 
coarse  is  to  have  a  target  blocks  with  high  scores  and  background  blocks  with  low  scores. 


The  purpose  of  the  final  grade  of  the  algorithm,  defined  in  equation  (5.5)  is  to  evaluate  the  separability 
between  the  target  block/s  and  the  background  blocks.  It  evaluates  the  difference  between  the  target 
block  score  (if  there  are  more  than  one  target,  the  mean  of  the  target  blocks)  and  the  mean  of  the 
background  blocks,  normalized  by  the  standard  deviation  of  the  background.  The  final  grade  obtained 
for  each  sequence  is  given  in  Table  4.  The  table  also  shows  the  grade  of  the  two  temporal  processing 
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algorithms  presented  in  section  3.2.3  -  the  variance  filter  (YF)  and  the  noise  filter  (NF)  and  previous 
temporal  processing  algorithm  from  reference  [14]  . 


Image  sequence  name 

Grade 

Ref.  [14]  Grade 

VF  Grade 

NF  Grade 

NPA 

78.417 

52.68 

2.70 

2.85 

M21F 

8.65 

10.82 

1.20 

13.89 

NA23 

13.056 

35.70 

0.54 

0.47 

Table  4:  Algorithm  performance  grade  for  each  sequence. 


The  variety  of  the  target  scores  for  the  different  sequences  can  be  understood  by  examining  the 
amplitude  of  the  maximal  target  peak,  relative  to  the  profiles  average  values. 


Sequence  name 

Target  Peak 

NPA 

~  30 

J2A 

~  140 

M21F 

~  18 

NA23 

~  60 

J13C 

~  150 

Table  5:  Target  maximal  peak  for  the  different  IR  sequences. 


The  sequences  NPA  and  NA23  are  inconsistent  with  each  other  -  although  the  target  in  NPA  seems 
weaker,  its  block  score  is  higher  than  the  one  of  NA23.  This  follows  from  the  fact  that  in  the  NA23 
scene  strong  clutter  is  present.  As  stated  in  the  following  subsection  the  optimal  window  size 
parameters  are  not  optimal  for  the  target  block,  but  are  fitted  to  the  background  as  well.  Choosing 
inappropriate  window  set  lowers  the  target  block  score. 


Therefore,  although  the  target  in  NA23  is  stronger  than  the  one  in  NPA,  the  target  block  score  received 
after  the  temporal  processing  is  lower. 


Finally  the  lowest  target  score  is  that  of  the  sequence  M2  IF.  The  target  here  is  quite  weak  and  the 
scene  is  noise  dominated.  Hence  the  low  target  amplitude  and  low  target  block  score.  The  final 
algorithm  is  the  lowest  one  as  well. 
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5.3.4  The  optimal  window  size 

Choosing  an  appropriate  set  of  parameters  for  the  temporal  algorithm  is  crucial  for  the  detection 
capabilities  of  the  system.  In  this  section  the  dependence  of  the  algorithm  on  the  window  sizes  is 
evaluated  on  real  data.  The  optimal  set  of  parameters  is  obtained  for  each  IR  sequence. 

The  expected  optimal  window  sizes  depend  on  both  the  target  temporal  profile  shape,  mainly  on  the 
target’s  peak  width  (inverse  proportional  to  the  target’s  velocity)  and  on  the  background  scene  -  the 
presence  of  clouds,  their  size  and  velocity,  as  stated  in  section  5. 1.2. 3. 


In  order  to  determine  the  optimal  set  of  window  sizes  on  a  real  data  sequence,  the  algorithm  was  run 
on  the  sequence  with  various  sets  of  parameters,  the  set  which  yielded  the  highest  algorithm  grade, 
defined  in  equation  (5.3),  was  chosen. 


Figure  49  shows  the  results  of  the  simulation  on  the  IR  sequence  NPA. 


Group  Width  10 


Group  Width  1 1 


Group  Width  12 


Group  Width  13 


10 

Overlap 


10  15 

Overlap 


Figure  49:  Algorithm  score  vs.  window  size  sets  for  the  IR  sequence  NPA. 


The  results  show  that  the  highest  algorithm  score  was  obtained  for  group  width  if  size  14  samples, 
overlap  of  size  7  samples  and  variance  window  of  size  6  samples  at  DC  window  of  size  50  samples. 
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The  final  algorithm  grade  evaluates  the  difference  between  the  target  block  score  (if  there  are  more 
than  one  targets,  the  relevant  scores  are  averaged)  and  the  mean  background  score,  normalized  by  the 
standard  deviation  of  the  background  blocks.  Thus,  the  optimal  window  set  will  tend  to  maximize  the 
target  while  minimizing  the  background  and  standard  deviation  scores. 

Table  6  summarizes  the  optimal  window  sets  for  each  IR  sequence.  The  results  of  the  algorithm 
performance  for  different  window  sizes  for  each  IR  sequence  are  given  in  the  appendix  of  this  work. 


Sequence  name 

Group  Width 

Overlap 

Var.  window 

NPA 

14 

7 

6 

M21F 

19 

18 

8 

NA23 

16 

5 

4 

Table  6:  Optimal  window  sets  for  each  IR  sequence. 
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5.4  Synthetic  hypercube  creation 

In  order  to  properly  evaluate  the  system,  a  real  hyperspectral  movie  was  needed.  Since  such  data  were 
not  available,  a  simple  procedure  for  creating  a  hyperspectral  movie  was  developed.  In  order  to  make 
the  movie  as  realistic  as  possible,  each  hypercube  from  the  sequence  is  created  from  a  real  IR  frame. 
This  efficiently  simulates  the  temporal  characteristics  of  the  evolving  background. 


The  hypercube  creation  is  performed  in  two  steps.  The  first  step  is  to  create  a  "background"  hypercube 
which  contains  cloud  and  sky  mixed  pixels.  The  second  step  is  to  implant  into  the  "background"  cube 
a  synthetic  target. 

5.4.1  The  background  hypercube 

The  hypercube  is  created  using  the  consecutive  frame  IR  sequence  presented  in  the  previous  section  as 
a  reference  for  obtaining  the  percent  of  sky  and  cloud  signature  of  each  pixel.  The  IR  frame  sequence 
containing  F  frames  is  denoted  by  IR={lR(f)}f=i  2 . f- 

First  arbitrary  signatures  representing  the  target,  cloud  and  sky  spectra  are  chosen;  these  signatures  are 
denoted  by  t,  c  and  b  respectively.  For  S  available  spectral  bands,  the  signatures  are  vectors  of  size 
Sxl.  Examples  of  the  signatures  used  in  this  work  are  given  in  Figure  50. 
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Figure  50:  Spectral  signatures  used  to  represent  the  sky,  cloud  and  target  signatures. 
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The  hypercube  at  each  time  index  /  is  constructed  by  using  the  real  infrared  frame  at  time  f  IR(f),  as 
the  reference  for  the  percentage  of  sky  and  cloud  component  for  each  pixel.  The  hypercube  created  at 
time  index /is  denoted  by  D(f).  DjjQ)  is  a  vector  of  size  Sxl  representing  the  spectral  signature  of  the 
pixel  coordinated  at  (i,j). 


(5.6) 


where  IRij(f)  refers  to  the  pixel  value  of  the  real  infrared  image  at  time  f  Vmin  is  the  minimum  pixel 
value  of  IRQ)(Vmin  =minij[IRQ)]),  Vmax  is  the  maximum  pixel  value  of  IR(f)  ( Vmax  =max$IRQ )]).  For 
example,  the  pixel  having  the  maximum  value  of  IRQ)  will  contain  100%  cloud  signature;  the  pixel 
with  the  minimum  value  of  IRQ)  will  contain  100%  sky  signature.  Pixels  with  intermediate  values  will 
contain  mixtures  of  sky  and  cloud  signatures  in  proportion  matching  the  pixel's  value  and  normalized 
to  a  fractional  value  between  0-1. 

The  hypercube  movie  is  comprised  of  F  hypercubes;  it  is  denoted  by  M={D(f)}f=iy2 . f- 

5.4.2  Target  implantation 

The  synthetic  target  is  characterized  by  the  following  parameters: 

1.  Spatial  shape,  denoted  by  s(x,y),  is  dependent  on  the  camera’s  point  spread  function  and  the  actual 
target  shape.  A  possible  choice  might  be  a  two-dimensional  gauss  function.  Figure  51  shows  an 
example  of  a  two  dimensional  target  having  a  half-period  sine  shape  in  both  x  and  y  directions. 

2.  Temporal  behavior,  where  the  velocity  is  measured  in  pixels/frame,  denoted  by  vx  and  vy. 


Example  of  sine-shaped  target 
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Figure  51:  A  two  dimensional  sine  shaped  target. 
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The  fraction  of  the  target  in  a  pixel  affected  by  it  is  given  by: 

Xq  +  A  _Pq  +A 

(5.7)  P(f)=  f  fs(x,y)dxdy 

*o  yo 

where  ({xo,x0+A},{  y0,yo+A})  denotes  the  pixel's  spatial  boundaries. 

There  are  two  methods  for  target  implantation  [15]  . 

(5.8)  y  =  pt  +  x 

(5.9)  y  =  pt  +  (\-p)x 

After  creating  the  sequence  /D (/)//=/, 2 . f  the  target  is  implanted  into  the  relevant  pixels;  equations 

(5.8)  and  (5.9)  are  modified  to: 

(5-10)  £,(/)  =  p(f)Gt  +  (l-p(f))D\j(f) 

(5-11)  Dy  (/)  =  p(f)Gt  +  D\j  (/) 

where  Dly(f )  represents  the  appropriate  mixed  pixel  from  the  background  cube  created  at  the  first 
step  and  G  is  a  normalizing  constant  factor  which  acts  as  the  target’s  base  gain. 

The  indices  of  the  pixels  affected  by  the  moving  target  change  according  to  the  target's  spatial  shape, 
velocity  and  direction.  Assuming  the  target's  start  position  is  at  coordinates  ( xo,yo ),  at  frame  /  the 
target's  position  will  be  (xo  +  fvx,  y0  +  f  vy).  Figure  52  illustrates  the  movement  of  the  target  having  a 
half-period  sine  spatial  shape,  a  width  of  2x2  pixels,  a  horizontal  velocity  of  0.5  pixel/ffame,  and  a 
vertical  velocity  of  0.25  pixel/frame. 


Figure  52:  Illustration  of  2x2  pixels  target  moving  at  Vy=0.5  pixel/frame,  ^=0.25  pixel/frame. 


5.4.3  Noise  addition 

The  noise  added  to  each  synthetic  hypercube  is  White  Gaussian  Noise.  In  order  to  keep  the  noise 
proportional  to  the  relevant  spectral  signature  magnitude,  the  noise  variance  is  dynamically  determined 
using  the  following  general  description: 
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(5.12)  cr2  =  NoiseFactor  *max(signature)2 

where  NoiseFactor  is  a  constant  with  value  between  0-1.  This  is  consistent  with  IR  imagery  where  the 
noise  power  is  proportional  to  the  signal  power  (brightness  of  the  object). 

In  addition,  noise  is  added  to  the  synthetic  target  temporal  movement  -  at  each  new  frame,  the 
effective  step  size  of  the  synthetic  target  is  calculated  using  the  following: 

StepSize  =  max(TheoreticalStepSize  +  n,  1) 

(5.13) 

n  ~  N( 0,  ( NoiseFactor  *  TheoreticalStepSize )  ) 

where  NoiseFactor  is  a  constant  between  0-1,  TheoreticalStepSize  is  a  constant,  calculated  from  the 
target  velocity,  and  n  is  the  noise  added  (or  subtracted)  which  has  zero  mean  Gaussian  Distribution 
with  standard  deviation  proportional  to  the  theoretical  step  size.  In  order  to  keep  the  motion  model 
physically  consistent,  the  step  size  must  be  a  positive  number. 


5.4.4  Examples  of  hyperspectral  movie 

Figure  54  shows  examples  of  several  bands  of  the  hypercube  with  a  synthetic  target  implanted  at  the 
upper  left  region  of  the  cube.  This  hypercube  was  created  without  the  addition  of  noise  to  the  spectral 
signatures.  The  IR  image  used  for  this  hypercube  is  shown  in  Figure  53. 
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Figure  53:  Example  of  IR  block  used  for  creation  of  hypercube. 


Page  88  of  102 


Project  Report  -  "Spatial  and  temporal  point  tracking  in  real  hyper  spectral  images" 


Band  2 


Benjamin  Aminov ,  Ofir  Nichtern,  Stanley  Rotman 

1 2040 
2020 
2000 
1980 
■  i960 


Band  40 

3300 
3200 
3100 
3000 
2900 
2800 
2700 


Band  10 


Figure  54:  Bands  2, 10,  20  and  40  of  synthetic  hypercube. 


Figure  55  shows  the  same  hypercube’s  bands,  but  with  the  addition  of  white  Gaussian  noise  to  each 

2 

spectral  signature.  The  noise  variance  is  set  to  be  [0.1  ■  max  (signature)]  (noise  factor  of  0.1). 
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Figure  55:  Bands  2, 10,  20  and  40  of  synthetic  hypercube  with  addition  of  spectral  noise. 


Figure  54  and  Figure  55  show  several  spectral  bands  of  a  hypercube  at  a  single  time  instance.  A 
spectrogram  of  single  spatial  location  (a  pixel)  of  the  hypercube  can  be  plotted  in  which  the  spectral 
variations  as  a  function  of  time  can  be  observed  for  the  given  pixel.  Figure  56  shows  a  spectrogram 
from  a  pixel  taken  from  the  hypercube  based  on  the  IR  image  given  in  Figure  53.  The  IR  image 
obviously  contains  drifting  clutter.  The  clutter  impact  on  the  temporal  spectral  changes  of  the  pixel  is 
evident  from  the  spectrogram.  The  signatures  used  for  this  hypercube  were  given  in  Figure  50.  The 
first  spectrogram  doesn’t  contain  target.  The  spectrogram  shows  higher  amplitude  values  for  spectral 
bands  35-80  with  low  values  at  bands  0-10  at  time  indices  0-40.  This  is  consistent  with  the  spectral 
signature  representing  the  cloud,  which  has  peaks  at  spectral  bands  40-80  besides  local  minima  around 
band  60.  Thus,  the  pixel  presented  was  traversed  by  a  clutter  approximately  at  times  0-60.  The  sky 
spectral  signature  appears  to  be  constant  with  the  time  at  time  indices  65-95.  Its  features  fro  bands  1- 
35  are  similar  to  the  sky  signature,  the  major  difference  is  evident  for  bands  35-95  where  the  sky 
signature  has  lower  and  more  unified  levels  of  amplitude. 


Page  90  of  102 


Project  Report  -  "Spatial  and  temporal  point  tracking  in  real  hyper  spectral  images" 


Benjamin  Aminov,  Ofir  Nichtern,  Stanley  Rotman 

The  lower  spectrogram  shows  the  same  pixel  but  with  target  implanted  into  it.  The  target  influence  can 
be  seen  at  time  indices  5-15,  causing  amplitude  increase  around  bands  20,  35  and  45.  Compared  to  the 
sky  signature,  the  target  signature  has  higher  amplitude  at  the  lower  spectral  bands.  Since  the  method 
of  target  implantation  is  additive  (and  not  by  replacing  the  relevant  portion  of  the  signature),  the 
spectrum  of  the  pixel  affected  by  both  clutter  and  target  is  actually  the  sum  of  the  two  signatures. 
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Figure  56:  Spectrogram  of  pixel  affected  by  clutter  with  and  without  target  implanted  in  it. 


Figure  57  shows  the  same  spectrogram  of  the  pixel  with  target  implanted  into  it  with  addition  of  white 
Gaussian  noise  to  each  spectral  signature.  The  noise  variance  is  set  to  be  [0.1*max (signature)]  (noise 
factor  of  0.1). 
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Figure  57:  Spectrogram  of  pixel  affected  by  clutter  and  target  with  addition  of  spectral  noise. 
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5.5  Evaluation  of  the  complete  system  on  real  data 

The  hyperspectral  movie  is  created  as  described  in  the  previous  section.  The  movie  consists  of  a 
sequence  of  30x30x96  cubes  (width  x  height  x  bands).  A  synthetic  target  is  implanted  into  the 
sequence.  The  target  is  sine  shaped  2x2  pixels  wide  and  has  a  horizontal  and  vertical  velocity  of 
0-1  \ppf\-  White  Gaussian  noise  is  added  to  each  spectral  signature;  the  noise  variance  is  set  to  be 
[0. 1  *Max  (signature)]2 . 

The  movie  is  then  input  into  our  system.  The  third  reduction  test  given  in  section  4.2  (using  the  match 
fdter  with  the  estimated  covariance  matrix)  is  applied  and  used  for  the  first  stage  processing  of  each 
hypercube,  and  the  temporal  processing  algorithm  described  in  section  4.3  for  the  target  detection  at 
the  second  stage.  The  output  of  that  stage  is  input  into  the  DPA.  At  the  end  the  last  processed  frame  is 
taken  and  the  highest  pixel  is  declared  as  'target'  and  its  track  is  found.  In  this  section,  we  will  compare 
Test  3,  and  two  new  tests,  Test  4  and  Test  5,  defined  in  the  next  sub-section,  will  be  used. 

5.5.1  Metrics  definitions 

The  metric  at  3. 2. 3. 2  defines  the  score  of  any  two-dimensional  image: 

The  block  SNR  is  given  as: 

(5.14) 

where  viyj  is  the  set  of  pixels  belonging  to  the  (i,j)th  block,  M  is  a  set  containing  the  five  pixels  with  the 
highest  gray  level  in  that  block,  and  a  is  the  standard  deviation  of  the  block  pixels.  The  formula 
performs  a  subtraction  between  the  expectation  value  of  the  highest  pixels  (target)  and  the  expectation 
value  of  the  rest  of  the  pixels  (background),  divided  by  the  standard  deviation  of  block  pixels.  Since 
the  probability  matrices  introduce  the  influence  of  target  pixels  on  adjacent  pixels,  these  influenced 
pixels  might  accumulate  higher  values  than  unaffected  pixels  (background);  we  will  regard  them  as 
target  pixels.  This  might  lower  the  expectation  value  of  the  target,  but  will  lower  the  standard 
deviation  of  the  background,  since  these  high  pixels  are  higher  than  the  statistics  of  the  background,  to 
a  more  accurate  one. 


Block  _  SNR(i,  j )  = 


£[v,..G¥]-f[v..?tf] 
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The  algorithm  score  is  given  as: 

Block  SNR  ( i ,  j\l  l>n  -  E\ {Block  SNR  ( i,  j)} 


(5.15) 


Score  =  ■ 


{Block  _SNR(ij)}{.J)=N 


Three  tests  based  on  this  metric  are  further  defined.  Test  3  uses  a  MF  for  the  cube  collapsing.  Test  4 
uses  a  MF  detector  as  the  input  of  the  temporal  processing.  Test  5  adds  to  Test  4  the  Dynamic 
Programming  algorithm.  These  tests  were  created  to  evaluate  the  effect  of  the  IR  tracking  algorithms 
on  the  overall  score  of  Hyperspectral  tracking  system. 

The  MF  and  the  temporal  processing  create  images  with  pixel  scores  according  to  their  likelihood  of 
being  a  target,  whereas  the  DPA  accumulates  the  scores  of  pixels  according  to  the  probability  of  the 
path  going  thru  them  to  be  the  target's  path. 


Test 

Score 


5.6  Discussion  of  results  obtain  on  real  data 

This  section  presents  the  results  of  applying  the  complete  system  algorithm  on  hyperspectral  movie 
based  on  blocks  from  the  real  IR  sequence  NA23.  The  algorithm  was  run  on  target  block  and  the  eight 
surrounding  background  blocks.  The  blocks  were  chosen  to  represent  different  scenes,  which  might  be 
roughly  categorized  as  ‘clear  sky’  scene  -  contains  only  sky,  ‘weak  clutter’  -  contains  partial  weak 
clutter  (with  low  to  medium  IR  amplitude)  and  ‘strong  clutter’  -  contains  partial  clutter  with  high  IR 
amplitudes.  The  parameters  of  the  simulation  are  summarized  in  Table  7. 

Table  8  -  Table  13  presents  the  results  for  three  different  background  hyper-cubes,  based  on  blocks 
from  the  real  IR  sequence  NA23,  Figure  59  shows  a  single  frame  from  the  sequence,  divided  into 
labeled  blocks. 
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Parameter 

Value 

Hyperspectral  movie  parameters 

Movie  length 

95  frames 

Spectral  signatures 

Identical  to  the  ones  presented  in  Figure  50 

Block  size 

30x30  pixels 

Number  of  spectral  bands 

100  bands 

IR  source  sequence 

NA23 

Noise  added 

WGN,  noise  factor  of  0.05  (std  =  noise  factor  *  0.05) 

Synthetic  target  properties 

Spatial  shape 

half  sine,  2x2  pixels,  integral  of  the  spatial  distribution 
normalized  to  0.5 

Horizontal  velocity 

1/8  pixels/frame 

Vertical  velocity 

1/8  pixels/frame 

Velocity  error 

as  described  in  section  5.4.3,  noise  factor  of  0.25 

Hyperspectral  cube  reduction 

Reduction  Filter 

Test  1,  Test  2,  Test  3 

Target  Block 

37  (scarce  clutter),  38  (sky  only),  39  (strong  clutter) 

Target  factor 

10,  20,  40,  60,  80,  100,  500,  1000 

Temporal  processing  parameters 

Sub  profile  length 

15  samples 

Overlap 

10  samples 

DC  window 

50  samples 

DC  step  size 

15  samples 

Variance  window 

4  samples 

DPA 

EMC  (b) 

[0...6]  for  £  =  0,  [0...1]  for  #  =  1 

a 

1 

p 

0.5 

y 

24 

Table  7:  Complete  system  simulation  parameters. 


Thp  IR  im anp  spnupnr.p  NA?3  Frpmp  1 


VHPQ 


Figure  59  :  Single  frame  of  IR  sequence  NA23  with  labeled  blocks  division. 
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Block 

Scene  Description 

Test  1  -  Spectral  Average 

Target  factor 

20 

40 

60 

80 

37 

Partial  weak  clutter 

1 .7622 

1 .7542 

1.7414 

1 .7204 

38 

Mainly  sky 

-1.3432 

-1.0566 

-0.5864 

-0.0326 

39 

Mainly  strong  clutter 

-0.0362 

-0.0371 

-0.0354 

-0.0392 

mean 

0.1276 

0.2201 

0.3732 

0.5495 

Table  8:  Test  1  system  evaluation  results. 


Block 

Scene  Description 

Test  2  -  Scalar  Product 

Target  factor 

20 

40 

60 

80 

37 

Partial  weak  clutter 

1 .3957 

1 .3572 

1.3213 

1.3103 

38 

Mainly  sky 

-1.2325 

-0.8918 

-0.3695 

0.2114 

39 

Mainly  strong  clutter 

-1.2276 

-1.2379 

-1.1656 

-1.0806 

mean 

-0.3548 

-0.2575 

-0.0712 

0.1470 

Table  9:  Test  2  system  evaluation  results. 


Block 

Scene  Description 

Test  3  -  MF 

Target  factor 

20 

40 

60 

80 

37 

Partial  weak  clutter 

2.3023 

2.2752 

2.2497 

2.2419 

38 

Mainly  sky 

-0.2196 

0.2049 

0.7912 

1 .3592 

39 

Mainly  strong  clutter 

-0.2132 

-0.2266 

-0.1337 

-0.0263 

mean 

0.62317 

0.7512 

0.9690 

1.1916 

Table  10:  Test  3  system  evaluation  results. 


Block 

Scene  Description 

Test  4  -  MF&TP 

Target  factor 

20 

40 

60 

80 

37 

Partial  weak  clutter 

2.9422 

3.6824 

3.7198 

3.7072 

38 

Mainly  sky 

3.2252 

3.6421 

3.7222 

3.7188 

39 

Mainly  strong  clutter 

0.5539 

1.6206 

3.1649 

3.4353 

mean 

2.2404 

2.9817 

3.5356 

3.6204 

Table  11:  Test  4  system  evaluation  results. 
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Block 

Scene  Description 

Test  5 -MF,  TP  &  DPA 

Target  factor 

20 

40 

60 

80 

37 

Partial  weak  clutter 

3.6473  ^ 

3.8205  w 

3.8248 (3) 

3.8448  (4) 

38 

Mainly  sky 

3.7568  (b) 

3.8255  ™ 

3.831 8  w 

3.8365  m 

39 

Mainly  strong  clutter 

2.0955  (e) 

2.3476  {ii) 

3.7458  (/) 

3.7171  (B) 

mean 

3.1665 

3.3312 

3.8008 

3.7994 

Table  12:  Test  5  system  evaluaetion  results  for  g=0.  Max  Numbers=3. 


Block 

Scene  Description 

Test  5  -  MF,  TP  &  DPA 

Target  factor 

20 

40 

60 

80 

37 

Partial  weak  clutter 

3.4213  (a) 

3.7624  w 

3.8384  m 

3.8063  m 

38 

Mainly  sky 

3.6276  <10> 

3.8049  (11> 

3.8492  m 

3.8350  w 

39 

Mainly  strong  clutter 

0.8168  (B) 

0.2714  tb> 

3.4032  (12» 

3.2192  m 

mean 

2.6219 

2.6128 

3.6969 

3.6201 

Table  13:  Test  5  system  evaluaetion  results  for  g=  1,  Max  Numbers=\. 


(I)  -  best  tracking  occurs  for  b>7 
(3)  -  best  tracking  occurs  for  b>3.4 
(5)  -  target  is  tracked  for  b>  1 .4. 

(7)  -  target  is  tracked  for  0<b<  1 

(9)  -  Tracking  occurs  for  0<6<0.3,  6=0.6 

(II) -  Tracking  occurs  for  0<6<0.9 


(2)  -  tracking  occurs  for  all  6's 
(4)  -  best  tracks  occur  for  4<6<5.4 
(6)  -  no  tracking  occurs. 

(8)  -  tracking  occurs  for:  (1)  0<6<4.2,  (2)  5.4<6<5.6,  (3)  6<6 
(10)  -  Tracking  occurs  for  0.6 <6<  1 
(12)  -  Tracking  occurs  for  0<6<0.5 


Previous  research,  [14]  ,  has  shown  that  Test  1  allows  a  rough  assessment  of  the  target  pixels  relative 
amplitude  compared  to  their  background.  The  low  values  of  the  results  indicated  that  the  implantation 
of  the  target  and  taking  the  maximal  score  without  any  processing  is  not  sufficient  for  detection;  in 
other  words,  the  implantation  method  does  not  allow  “easy”  detection.  The  highest  values  of  Test  1 
were  in  the  weak  clutter  scenes  which  is  reasonable  since  the  implantation  method  is  additive,  and,  in 
weak  clutter  surroundings,  the  amplitude  levels  are  obviously  higher  than  clear  sky  scenes. 
Comparison  between  Test  1  and  Test  2  allowed  us  to  estimate  the  improvement  of  using  a  primitive 
hyperspectral  processing  -  simply  taking  the  average  of  all  the  bands.  Although  this  brought 
improvement  in  sky  or  weak  clutter  scenes,  it  had  negative  impact  on  strong  clutter  scenes  which 
proved  that  simply  averaging  the  bands  is  disastrous  for  certain  sets  of  spectral  signatures  and  cannot 
be  used  as  a  detection  method  by  itself.  Thus  the  focus  of  the  discussion  should  be  the  use  of  “smart” 
hyperspectral  processing  only  (Test  3),  the  use  of  hyperspectral  processing  and  temporal  processing 
(Test  4)  the  use  of  hyperspectral  processing,  temporal  processing  and  DPA  (Test  5). The  results  in  [14] 
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have  shown  that  in  all  of  the  cases,  there  is  an  obvious  advantage  to  using  both  hyperspectral  detection 
(Matched  Filter)  and  temporal  processing  (Test  4  vs.  Tests  1-3). 

When  the  target  is  implanted  in  clear  sky  scenes,  the  use  of  temporal  processing  significantly  improves 
the  performance  compared  to  use  of  hyperspectral  detection  only.  In  most  cases  the  use  of  the  Matched 
Filter  compared  to  simple  averaging  was  clearly  advantageous;  the  exception  being  block  3 1  for  which 
the  performance  was  similar  for  both  of  the  techniques.  This  can  be  attributed  to  the  relative  “easiness” 
of  detection  in  this  kind  of  scene  and  the  fact  that  the  high  level  of  noise  might  cause  a  disadvantage  to 
the  Matched  Filter  and  an  advantage  to  the  averaging  filter.  When  weak  clutter  was  present,  the 
temporal  processing  combined  with  Matched  Filter  detector  was  always  better  than  temporal 
processing  only  which  in  turn  is  better  to  hyperspectral  detection  only. 

Preliminary  runs  have  been  done  for  target  factor  1000,  500,  100  and  10.  A  valid  range  of  target  factor 
was  needed  to  be  found  in  order  to  define  the  boundaries  of  the  full  system.  The  results  have  shown 
that  target  factor  of  100  or  above  are  'too  easy'  to  use  a  detection  algorithm,  whereas  target  factor  of  10 
is  'too  hard'  for  the  full  system  to  detect  and  track  a  target.  Since  that  is  the  case,  tryouts  of  the 
following  values  of  target  factor  have  been  applied:  20  -  80  at  steps  of  20. 


(a)  (b) 

Figure  60:  Block  38,  Target  Factor  100:  tal  Single  Frame  after  MF  (Test  3),  (b)  Single  frame  after  MF  and  TP  (Test 

4}- 
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(a) 


(b) 


(c) 

Figure  61:  Block  38,  Target  Factor  20:  (a)  Single  Frame  after  MF  (Test  3  -  Frame  643),  (b)  Single  frame  after  MF 

and  TP  (Test  4  -  Frame  12),  (c)  Single  frame  after  MF,  TP  and  DPA  (Test  5  -  Frame  12). 


3  As  described  at  the  Table  7,  "sub  profile  length  is  equal  to  15,  overlap  is  equal  to  10"  have  been  chosen:  taking  frame  12 
after  the  TF  is  equivalent  to  taking  a  frame  between  55-70  in  the  sequence  after  MF. 
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6  Summary  and  Conclusions 

This  report  consists  of  the  culmination  of  three  years  of  work.  It  is  interesting  to  consider  the  steps 
which  have  been  taken  to  produce  these  results.  If  we  look  at  the  System  Diagram  in  Figure  6,  we  find 
that  the  following  work  has  been  performed: 


1 .  The  construction  of  a  database  of  hyperspectral  movies.  The  fundamental  broadband  moview 
were  provided  by  the  Air  Force  Research  Laboratory;  we  designed  a  way  to  convert  them  to 
hyperspectral  movies. 

2.  The  matched  filter  stage  was  taken  from  the  standard  literature.  Future  work  will  involve  using 
somewhat  more  sophisticated  algorithms  to  be  used  to  process  the  spectral  domain. 

3.  The  temporal  stage  was  designed  and  tested  on  the  AFRL  movies.  This  work  has  been 
documented  extensively  in  Ref.  [14]  . 

4.  The  Dynamic  Programming  Algorithm  has  been  built  and  tested. 

5.  An  alternative  approach  for  the  temporal  and  DP  A  stages  has  been  considered;  this  involved 
the  Kalman  Filter  and  will  be  documented  in  a  separate  report  (see  appendix). 

6.  Metrics  were  developed  to  test  the  effect  of  each  stage  on  our  target  acquisition  capability. 


We  overall  conclude  that  each  stage  enhanced  our  target  acquisition  capability. 


One  important  note  concerning  this  work  is  that  it  is  modular.  The  modules  for  spectral  processing, 
temporal  processing,  DPA  processing  and  even  our  method  of  evaluating  algorithms  can  be 
individually  replaced  and  tested.  Our  work  can  thus  easily  be  extended  to  new  developments. 


It  has  been  a  pleasure  to  work  on  this  project  for  the  last  three  years;  we  thank  AFOSR  for  the 
opportunity. 
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Appendix 

A  Practical  Method  of  Tracking  a  Single 
Target  in  Cluttered  Hyperspectral  Data 


Y.  Simson  and  S.R.  Rotman 
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1  Introduction 


Tracking  a  dim  target  in  clutter  is  a  well  known  problem  in  the  scientific 
literature.  There  are  two  main  approaches.  The  first  is  to  initially  threshold 
every  image  and  then  make  a  decision  as  to  which  sequence  of  measurements 
is  the  most  likely  track.  Among  the  various  methods  in  this  category  are  the 
IMM  (Interacting  Multiple  Models),  GPBF  (Generalized  Pseudo  Bayesian 
Filter  )  and  MHT  (Multiple  Hypothesis  Testing).  The  second  approach 
is  to  delay  the  thresholding  until  the  end  of  the  tracking  process.  This  is 
usually  is  carried  out  by  using  the  DPA  (Dynamic  Programming  Algorithm). 
Both  approaches  have  advantages  and  disadvantages.  The  first  approach 
(IMM, MHT)  is  better  suited  to  real  time  applications  whereas  the  second 
approach  (DPA)  is  considered  better  at  tracking  targets  with  a  low  SNR.  In 
this  paper  the  former  approach  is  taken;  specifically  we  implement  the  IMM 
algorithm. 

When  using  the  IMM  algorithm  there  always  is  a  chance  of  false  alarms. 
This  makes  the  problem  a  lot  more  complex.  Not  only  is  there  a  challenge  of 
estimating  the  real  position  and  speed  but  in  addition  one  has  to  decide:  out 
of  a  number  of  possible  targets,  which  is  the  real  target  or  whether  a  target 
has  even  been  detected  at  all.  There  are  two  different  types  of  clutter.  The 
first  is  residual  clutter .  The  second  is  termed  persistent  clutter .  The  two 
basic  assumptions  about  residual  clutter  are  that  the  clutter  or  false  mea¬ 
surements  are  independent  across  time  and  uniformly  spatially  distributed. 
With  these  assumptions  there  are  ways  to  model  the  residual  clutter.  The 
persistent  clutter  can  either  be  canceled  by  pre-processing  or  by  modeling 
it  and  tracking  it.  For  example  false  tracks  generated  by  edges  between 
different  regions  in  an  image.  If  the  speed  of  the  background  is  known,  then 
these  false  tracks  will  have  the  same  speed  and  direction,  therefore  can  be 
filtered  out.  The  approach  adopted  here  for  dealing  with  the  residual  clutter 
is  that  of  Y.  Bar  Shalom  in  [7]  and  [5]  namely  the  PDAF  (Probabilistic  Data 
Association  Filter)  algorithm.  When  combined  with  the  IMM  the  resulting 
algorithm  is  termed  IMM-PDAF.  This  problem  and  the  problem  of  initiat¬ 
ing  the  track  sequence  will  be  dealt  with  after  treating  the  simpler  problem 
of  a  single  target  in  residual  clutter. 

2  Tracking  Overview 

Both  tracking  methods  (DPA/IMM-PDAF)  have  the  some  main  steps  in 
common.  However  the  steps  are  carried  out  in  a  different  order.  For  the  first 
approach  the  IMM-PDAF  algorithm  was  chosen  for  its  relative  simplicity  yet 
good  results  in  comparison  to  the  GPBF  and  the  MHT.  The  principle  steps 
are: 

•  Pre-processing:  Usually  done  by  linear  filtering,  ordered  statistics  fil- 
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ters. 


•  Thresholding  the  image. 

•  IMM-PDAF :  The  algorithm  will  be  explained  later  in  detail. 

•  Thresholding  the  surviving  tracks. 

For  the  DPA  the  basic  steps  are: 

•  Pre-processing:  Usually  done  by  linear  filtering  or  ordered  statistics 
filters. 

•  DPA:  The  Viterbi  algorithm. 

•  Thresholding  the  results. 

2.1  Pre-processing 

In  this  research  two  methods  for  pre-processing  are  considered;  linear  and 
ordered  statistics  filters.  The  assumption  behind  both  approaches  is  that 
the  target  is  added  in  the  following  manner  to  the  original  picture: 

u(m ,  n)'  —  u(m ,  n)  + 1  (1) 

where  7/(777,  n)  is  the  original  pixel  at  coordinates  (m,  n).  The  modified  pixel 
is  7/(777,77)',  which  is  due  to  the  added  target  t.  The  purpose  of  the  pre¬ 
processing  stage  is  to  extract  pixels  with  the  added  target  t.  Ideally  if  the 
intensity  of  the  original  pixel  is  known  then  there  is  no  problem.  Simply 
subtract  7/(777,77)  from  7/(777,77)':  then  one  gets  t  if  there  is  a  target  and  0  if 
there  is  no  target.  In  real  problems  there  is  only  an  estimate  of  the  original 
pixel  7/(777,  n)  which  can  be  modeled  as  the  original  pixel  with  some  noise 

7/(777,  n)  —  u(m ,  n)  +  77(777 ,  n)  (2) 

One  usually  assumes  that  the  added  noise  is  spatially  uncorrelated.  When 
a  target  is  present  one  gets  a  whitened  image  with  the  target  in  the  relevant 
pixels 

7/(777 ,  n)r  —  u(m ,  n)  =  t  +  77(771,  n)  (3) 

The  name  of  the  game  is  to  find  an  estimate  of  the  background  7/(771,  n) 
such  that  the  variance  of  the  estimation  noise  77(771,77)  is  minimized. 

2.1.1  Linear  filtering 

Estimating  the  background  pixel  with  linear  filtering  is  the  classic  approach 
and  is  the  most  efficient.  The  estimate  is  the  result  of  the  average  of  the  8 
surrounding  pixels  from  a  3  x  3  frame  or  the  16  outer  pixels  from  a  5  x  5 
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frame.  The  resulting  estimation  noise  77(771,  n)  is  gaussian  distributed  and 
usually  has  a  zero  mean.  The  disadvantage  to  this  method  is  that  there  are 
long  tails  that  result  from  areas  in  the  picture  with  a  high  local  variance. 
The  tails  are  the  main  cause  of  false  alarms. 


2.1.2  Ordered  Statistics  Filtering 


One  of  the  best  ways  of  eliminating  the  false  alarms  on  the  edges  is  to  es¬ 
timate  the  background  by  taking  the  maximum  value  out  of  the  8  or  16 
surrounding  pixels.  A  second  option  is  to  use  the  median;  however,  the 
maximum  is  better  at  reducing  the  false  alarms.  The  results  are  shown 
for  a  section  of  the  first  frame  of  the  ’npa’  picture.  The  distribution  for 
median  filtering  is  normal  while  the  distribution  for  the  Maximum  filter  is 
Extreme  Value  distributed.  The  Extreme  Value  Distribution  typically  oc¬ 
curs  when  choosing  the  maximum  value  out  of  a  set  of  Gaussian  distributed 
random  variables.  The  probability  density  function  for  the  Extreme  Value 
distribution  is: 


f(x\n,a)  =  -  exp 

C 7 


X  ~  fl 

a 


exp  <  —  exp 


x  —  fl 
a 


(4) 


2.2  Normalizing  the  Noise 


After  choosing  between  a  variety  of  basic  methods  for  whitening  the  noise 
77(777, 77), we  must  deal  with  the  fact  that  the  remaining  noise  isn’t  really 
white.  The  main  problem  being  the  image  is  non-stationary  and  is  com¬ 
prised  of  different  types  of  background.  The  typical  example  in  our  data 
sets  is  a  target  with  both  clouds  and  sky  in  the  background.  The  sky  and 
cloud  backgrounds  have  different  variances.  Since,  the  cloud  has  the  highest 
variance  the  false  alarms  and  tracks  usually  are  in  the  cloud.  This  can  be 
addressed  by  normalizing  the  pixel  by  their  local  standard  deviation.  The 
problem  with  this  approach  is  that  the  estimate  of  the  local  deviation  can 
be  zero.  Division  by  zero  or  a  small  number  will  invariably  be  the  cause  of 
undesirable  false  alarms.  A  solution  to  this  problem  has  been  suggested  by 
Raviv  and  Rotman  in  [1].  The  idea  is  to  divide  the  pixel  by  the  standard 
deviation  and  at  the  same  time  avoid  division  by  zero.  This  can  happen 
when  the  local  standard  deviation  is  zero.  One  simply  divides  the  pixel  by 
the  standard  deviation  with  an  added  constant  in  the  following  manner 


I (777, 77) 


7/(777,  n)r  —  7/(777, 77) 
sdv{m ,  77)  +  c 


(5) 


where  sdv(m,n )  is  the  local  standard  deviation  at  pixel  7/(777,77)  calculated 
from  the  16  or  8  closest  pixels.  The  easiest  way  to  choose  the  constant  c  is 
to  take  the  pixel  with  the  maximum  probability. 


c  =  m&x{Psdv(x)} 


(6) 
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A  section  of  the  npa  movie  (First  Frame) 
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A  section  of  the  npa  movie  whitened  with  Linear  filtering  (First  Frame) 
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(a) 


(b) 


A  section  of  the  npa  movie  whitened  with  Median  filtering  (First  Frame) 
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A  section  of  the  npa  movie  whitened  with  Maximum  filtering  (First  Frame) 
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(d) 


Figure  1:  Pre-Processed  images  before  thresholding:  (a)  The  original  image, 
(b)  The  image  whitened  with  a  Linear  filter,  (c)  The  image  whitened  with  a 
Median  filter,  (d)  The  image  whitened  with  a  Maximum  filter.  The  encircled 
pixel  is  the  target. 
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Histogram  and  estimated  distribution  for  a  whitening  Linear  Filter 


Histogram  and  estimated  distribution  for  a  whitening  Median  Filter 


(a)  (b) 


(c) 

Figure  2:  The  resulting  distributions  of  the  whitened  images:  (a)  The  image 
whitened  with  a  Linear  filter,  (b)  The  image  whitened  with  a  Median  filter, 
(c)  The  image  whitened  with  a  Maximum  filter. 
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where  Psdv(x)  is  the  PDF  of  the  standard  deviations.  The  constant  c  is 
calculated  individually  for  each  frame. 

The  local  standard  deviation  of  the  first  frame  of  a  movie  can  be  seen  in 
fig.  3(a).  One  can  see  from  fig.  3(b)  that  it  is  a  simple  matter  to  choose  the 
constant  c.  From  figures  3(c)  and  3(d)  we  can  see  that  the  normalization 
reduces  the  variance  in  the  area  of  the  clouds,  while  at  the  same  time  it 
emphasizes  the  target  (the  white  pixel  close  to  the  center  of  the  picture). 
The  alternative  to  normalization  with  the  standard  deviation  is  to  use  a 
spatially  adaptive  threshold  as  explained  in  the  following  section. 


10  20  30  40  50  60  10  20  30  40  50  60 


(c)  (d) 

Figure  3:  The  image  of  the  local  standard  deviations:  (b)  The  original  image, 
(b)  The  histogram  of  image  (a),(c)  The  first  frame  after  having  the  mean 
removed,  (d)  The  first  frame  after  having  the  mean  removed  and  normalized 
by  the  local  standard  deviation. 


2.3  Thresholding  the  Image 

The  most  basic  way  of  trying  to  separate  between  the  real  targets  and  the 
false  alarms  is  to  set  a  global  threshold.  Ordinarily  one  chooses  a  threshold 
such  that  it  results  in  an  acceptable  level  of  false  alarms.  Thus  the  desired 
threshold  th  is  extracted  from  the  following  equation 

poo 

Pfa  =  Po  0)  da  (7) 

J  th 

where  Pfa  is  the  level  of  false  alarms  and  po  (a)  is  the  PDF  (Probability 
Density  Function)  for  77(772,  n)  when  there  is  no  target  present. 

There  are  ways  of  improving  on  this. 

•  A  Spatially  Dynamic  Threshold 

•  A  time  Dynamic  Threshold 

2.3.1  A  Spatially  dynamic  threshold 

In  a  non-stationary  image,  a  global  threshold  is  not  a  very  good  idea.  It 
would  be  better  to  segment  the  image  and  set  a  different  threshold  for  each 
segment.  Another  possibility  would  be  to  use  a  threshold  that  is  a  function 
of  the  local  variance  calculated  from  the  surrounding  8  or  16  neighboring 
pixels.  Thus 

th  —  a  +  /3  •  ct(t72,  n)  +  5  •  ct2(tt2,  n)  (8) 

The  challenge  in  this  case  is  to  find  out  whether  one  can  make  a  significant 
reduction  in  the  ratio  of  false  alarms  to  detections  with  this  approach.  An¬ 
other  even  greater  challenge  is  to  globally  estimate  the  set  of  parameters 
(<a,/?,  5).  A  successful  way  to  estimate  these  parameters  online  has  yet  to 
found  and  should  be  a  topic  for  further  research. 

2.3.2  A  Time  Dynamic  Threshold 

Setting  the  threshold  in  a  correct  manner  is  vital  for  the  success  of  the 
tracking  algorithm.  This  goes  both  for  the  initial  stage  and  during  the 
tracking  process.  The  first  issue  is  how  to  determine  the  right  threshold  in 
the  beginning.  Currently  the  threshold  is  set  by  setting  Pfa  the  acceptable 
false  alarm  rate,  and  determining  a  threshold  that  will  satisfies  equation 
(7).  If  the  resulting  signal  lands  below  the  average  signal  amplitude  then 
the  threshold  is  set  slightly  below  the  signal  average.  This  comes  at  the  cost 
of  more  false  alarms  but  it  is  necessary  because  otherwise  the  targets  will  all 
be  below  the  threshold.  Some  ad  hoc  methods  for  changing  the  threshold 
during  the  tracking  process  are  discussed  in  section  4.5.4. 


9 


2.4  The  Tracking  Stage 

After  thresholding  comes  the  tracking  stage.  We  chose  the  IMM-PDAF 
method  for  it’s  numerical  efficiency  on  one  hand  and  good  performance  on 
the  other  hand.  The  PDAF  (Probability  Data  Association  Filter)  is  designed 
to  take  care  of  random  clutter.  The  idea  is  to  associate  a  set  of  measurements 
to  a  given  track  with  prior  knowledge  of  the  targets  behavior.  When  one 
doesn’t  have  exact  prior  knowledge  of  the  targets  behavior  the  situation 
calls  for  the  multiple  model  approach.  If  the  target  switches  between  a 
few  different  models,  then  the  IMM  (Interactive  Multiple  Model)  algorithm 
has  been  widely  shown  to  provide  good  results  by  Blackman.  S  in  [2]  and 
Bar-Shalom  [7]. 

In  a  problem  with  a  high  SNR  or  a  high  ratio  of  clutter  to  the  real 
target  the  single  target  approach  might  not  be  good  enough.  It  is  necessary 
to  maintain  a  Track  Before  Detect  Algorithm  and  keep  tracking  a  number 
of  high  probability  tracks  throughout  the  tracking  process.  This  means 
that  ones  simultaneously  tracks  a  number  of  possible  targets  at  the  same 
time.  When  tracking  multiple  targets  an  additional  complication  arises,  i.e. 
tracks  cross  and  the  validation  regions  overlap.  Then,  it  is  possible  to  have 
observations  that  could  belong  to  either  track.  When  that  happens,  one  has 
to  make  an  optimal  decision  regarding  the  assignment  of  the  observations 
to  the  tracks.  I  use  a  simplified  method  in  which  the  track  with  the  highest 
probability  takes  the  measurements  in  its  validation  region.  The  methods 
to  assign  a  probability  to  the  track  and  set  up  the  validation  region  are 
discussed  in  the  following  sections.  Better  but  more  complex  approaches 
are  the  JPDAF  (Joint  PDAF)  and  MHT  (Multiple  Hypothesis  Tracking)  as 
described  in  [2]  and  [7]. 

I  will  give  a  brief  outline  of  the  tracking  algorithm  which  is  explained  in 
detail  in  the  following  sections.  There  are  3  different  stages: 

Track  Initiation  At  first  pairs  are  created  from  the  first  two  scans.  Then 
one  sets  a  gate  around  the  next  predicted  detection.  Either  there  are 
detections  in  the  following  gate  and  the  track  survives  or  it’s  prob¬ 
ability  drops  due  to  no  detections  and  it  is  killed.  The  IMM-PDAF 
algorithm  is  run  for  each  and  every  pair.  Measurements  that  are  not 
associated  with  any  tracks  are  used  to  set  up  new  tracks  at  every  scan. 

Track  maintenance  In  the  case  of  high  SNR  and  a  single  target,  the  best 
track  is  chosen  from  the  track  initiation  stage  and  the  rest  of  the  other 
tracks  are  dropped.  If  the  target’s  SNR  is  low  then  it  will  be  difficult 
to  differentiate  between  the  clutter  and  the  target.  Hence  it  will  be 
necessary  to  track  many  possible  tracks  before  making  a  decision  to 
keep  or  drop  them.  In  the  low  SNR  case,  it  may  be  necessary  to  use 
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the  multi-target  approach  to  help  cope  with  this  problem  even  in  the 
case  of  a  single  target. 

Track  Termination  and  Re-Initialization  When  tracking  either  a  sin¬ 
gle  target  or  even  multi-targets,  sometimes  the  target  can  be  lost  and 
the  track  probability  drops  below  a  pre-determined  acceptable  value. 
Then  the  low  probability  track  is  dropped.  In  the  single  target  case  it 
will  be  necessary  to  re-initialize  the  whole  process.  In  the  multi-target 
case  this  won’t  be  necessary  as  associated  measurements  can  be  used 
to  start  new  tracks.  In  the  multi-target  case,  some  times,  all  existing 
tracks  are  killed,  and,  in  this  case,  it  is  necessary  to  drop  the  threshold 
and  create  new  tracks. 

The  IMM-PDAF  algorithm  can  be  broken  down  into  the  following  steps 
which  are  carried  out  in  an  iterative  manner. 

•  The  mixing  stage  of  the  IMM 

•  The  filtering  stage  of  the  IMM-  application  of  the  PDAFAI: 

—  Use  the  prediction  equations  of  the  kalman  filter  for  each  mode. 
Calculate  the  gate  size 

—  Scan  Image 

—  Pre-process  the  image 

—  Threshold  the  image 

—  Validate  the  measurement 

—  Calculate  association  probabilities  (3 ^  for  measurement  i  and  for 
mode  j  at  time  k. 

—  Update  the  measurement 

•  The  Combination  stage  of  the  IMM 

2.5  The  Post  Tracking  Stage 

A  probability  value  is  assigned  to  each  track.  The  track  with  the  highest 
score  is  the  real  track.  Another  important  criterion  is  the  length  of  the  track 
life.  If  the  algorithm  locks  to  the  target  for  less  than  50%  of  the  time  then, 
we  suggest  that  the  track  should  be  ignored.  True  tracks  tend  to  have  longer 
lifetimes  than  tracks  that  are  generated  by  clutter. 
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3  Tracking  a  Single  Target  with  Background  Clut¬ 
ter 

3.1  Validation  of  Measurements 

In  practice  we  do  not  consider  every  possible  target  in  the  image  or  from  the 
sensor.  In  order  to  reduce  the  complexity  of  the  problem  we  focus  only  on 
areas  where  it  is  highly  probable  to  find  the  target.  The  validation  procedure 
suggested  by  [5]  is  to  only  consider  observations  that  fall  in  the  vicinity  of  the 
prior  estimate  of  the  next  measurement  zk\k-i  =  E[zk\Z\.  The  cumulative 
set  of  the  real  measurements  up  to  time  k  is 

Zk  =  {zi,z2, . . .  ,zk}  (9) 

The  next  observation  zk+\  conditioned  on  Zk  is  assumed  to  be  gaussian 
distributed 

zk+i\k  ~  ^fc+i)  (1°) 

where  the  associated  covariance  matrix  Sk+i  is  defined  as 

$k+l  —  E  ^  [**+l  —  ^fc+l|fc]  [^fc+1  —  ^fc+l|fc]  \Zk  j" 

=  Hk+{Lk+1\kHl+1  +  Rk+i  (11) 

and  the  elliptical  validation  region  at  time  k  +  1  is  defined  as 

14+1(7)  =  {z  ■  [zfe+1  -  4+i| k]T  Sk~\  [zk+ 1  -  4+i|fc]  <  7}  (12) 

The  quadratic  expression  from  (12)  is  chi-square  distributed  with  nz  degrees 
of  freedom.  Alternatively  expression  (12)  can  be  written  as 

14+1(7)  =  {z  :  vk+ 1  (7^+1)  _1  Vk+1  <1}  (13) 

where  the  innovation  vk  is  defined  as 


k'k  zk  zk\k— 1 


This  defines  a  hyper-ellipsoid  region  of  the  dimension  nz.  As  an  illustra¬ 
tion,  let’s  take  a  look  at  the  2-Dimensional  case.  The  area  of  a  unit  circle 
is  7 r,  a  and  b  are  the  lengths  of  the  axes.  This  leads  to  the  next  important 
fact:  the  volume  of  the  hyper-ellipsoid  region.  This  is  defined  by  the  size  of 
unit-hypersphere  multiplied  by  the  product  of  the  lengths  of  the  semi-axis 
of  the  hyper-ellipsoid.  In  the  example  of  the  2-Dimensional  ellipsoid  the 
lengths  of  the  semi-axis  are  a  and  b.  The  “volume”  of  a  2-Dim  sphere  is 
7 to?  .  The  area  of  an  ellipsoid  is  nab. 
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The  volume  of  a  hyper-sphere  of  dimension  n  with  a  radius  r  is 

-  n 

7T  2  r 

Cre  =  F(fTI) 


where  T(n)  is  the  gamma  function  defined  as 


n  >  0 


The  product  of  the  lengths  of  the  semi-axis  is  obtained  by  the  product  of 
the  square  root  of  the  eigenvalues  of  the  matrix.  The  matrix  in  this  case  is 
ySfc+i  and  the  squared  root  of  the  determinant  is  equivalent  to  the  product 
of  the  lengths  of  the  semi-axis.  This  is  because  the  determinant  is  equal  to 
the  product  of  the  eigenvalues. 


To  summarize,  the  volume  of  the  Validation  Region  is 
Vk+ 1  =  Cnz  |7-Sfc+1|1/2 

=  cnz  Yi^nz  '  4+1 1  ^ 

=  °nz  It4J1/2  •  l4+i|1/2 

=  cnzln*/2-\Sk+l\l/2 

=  cnzgnz  •  l^+il1/2  (14) 

where  The  probability  of  the  real  target  falling  in  the  gate  or  the 

validation  region  is  defined  as 

Pg  —  P  j^/c+i  £  Vfe+i}  (15) 

Calculating  this  value  as  a  function  of  7  is  difficult  enough  but  a  simple 
manipulation  can  make  things  easier.  First  we  define  a  linear  transform  of 

Zk 


4  =  A k  1/2Uk  (zk  -  4|fc_i)  (16) 

where  is  the  eigenvalue  matrix  of  S&  and  U \  is  the  right  eigenvector  of 
Sk •  Then  we  get  z k  which  is  gaussian  distributed  as 

4-V(0,4J  (17) 

This  enables  us  to  redefine  (12)  as 

U+i(7)  =  {^:4+i4+i<7} 

=  {5  :  II4+1II2  <  7}  (18) 
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7  =  92 
g: 

1 

1 

4 

2 

9 

3 

16 

4 

25 

5 

6.6 

2.57 

9.2 

3.03 

11.4 

3.38 

nz  =  l 

.683 

.954 

.997 

.99994 

1.0 

.99 

nz  =  2 

.393 

.865 

.989 

.9997 

1.0 

.99 

CO 

II 

£ 

.199 

.739 

.971 

.9989 

.99998 

.99 

Table  1:  Gating  thresholds  and  values  of  probability  mass  in  gate 


Instead  of  having  to  calculate 

Pg{  7)  =  /  .  fzk+1\zkiak+i)dak+i 

J  {zk+l£Vk+l} 

one  calculates 

Pg{  7)=  /  .  fzk+1\zk(ak+i)dak+1 

d  {zk+i^Vk+i} 

where  fZk+1\zk(^k+i)  is  the  PDF  from  (10)  and  hk+1\zk(^k+i)  the  PDF 
from  (17). 

We  show  how  to  explicitly  calculate  these  values  in  Appendix  A.  The 
above  Table  1  of  gating  thresholds  can  be  found  in  references  [7]  and  [5]. 

The  set  of  validated  measurements  at  time  k  is  defined  as 

m  =  k'E  (is) 

where  m &  is  the  number  of  valid  measurements  at  time  k.  The  cumulative 
set  of  measurements  is  up  to  time  k  is  Z^.  The  size  of  the  gate  or  validation 
region  is  determined  by  the  parameter  7.  This  should  vary  in  time  according 
to  the  amount  of  clutter  and  various  other  criteria. 

3.2  The  Nearest-Neighbor  Standard  Filter 

In  the  nearest-neighbor  standard  filter  the  closest  measurement  to  the  prior 
estimated  measurement  Zk\k-i  is  chosen.  The  distance  that  we  use  in  order 
to  decide  which  measurement  is  closest  in  probability  to  the  prior  estimated 
measurement  is 

d2i(k )  =  [■ zk  -  4| fc-i]TSfc  1  I V  -  4|fc-i]  (20) 

This  is  also  referred  to  as  the  normalized  innovation  squared  (NIS).  The 
measurement  that  we  choose  is  the  one  that  brings  expression  (20)  to  a 
minimum.  The  disadvantage  to  this  method  is  that  we  don’t  take  into 
account  previous  possible  measurements.  Another  disadvantage  is  that  it 
doesn’t  take  into  account  the  possibility  that  none  of  the  measurements  in 
the  given  set  of  measurements  is  a  target. 
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3.3  Clutter  Model 

3.3.1  The  Parametric  model 

In  the  case  of  tracking  a  target  in  an  image,  let’s  say  that  there  are  N  pixels. 
In  addition  we  shall  assume  that 

•  The  detection  events  are  independent  of  each  other. 

•  The  probability  of  such  a  detection  being  a  false  alarm  is  Pfa  in  each 
cell. 


Then  the  probability  of  the  number  of  false  alarms  is  Bernoulli  distributed 
and  given  by 

P{nFA  =  n}=  (  "  (21) 

Assign  the  volume  of  the  N  cells  under  inspection  the  value  V  and  call  the 
spatial  density  of  the  false  alarms 


_  E[nFA]  _  Np 


(22) 


If  P  <  1  and  N  is  large  enough  then  the  Bernoulli  distribution  can  be 
approximated  by  a  Poisson  distribution  and  thereby  we  obtain 


pF(m) 


_  e-NP  (Np) 
ml 


(23) 


By  using  the  spatial  density  ratio  from  (22),  (23)  now  becomes 


Mm)  =  e--(A Vp 

ml 

This  method  requires  knowledge  of  the  parameter  A.  In  the  next  subsection 
a  non-parametric  model  will  be  discussed  that  doesn’t  require  knowledge  of 
the  parameter. 


3.3.2  The  non-parametric  Model 

In  the  non-parametric  model  we  assume  that  the  PMF  is  uniformly  distrib¬ 
uted. 

=  Jd’  mk  =  0,l,...,N  -1  (25) 

where  M  is  the  number  of  false  alarms.  We  will  see  in  section  3.4.2  that  the 
value  of  M  is  not  important. 
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3.4  The  Probabilistic  Data  Association  Filter 
3.4.1  The  Assumptions  of  the  PDAF 

With  this  approach,  we  take  into  account  the  possibility  that  none  of  the 
valid  measurements  are  due  to  the  target.  We  note  that  the  previous  valid 
measurement  are  not  taken  into  account.  (An  approach  that  takes  these 
measurements  into  account  is  referred  to  as  the  The  Optimal  Bayesian  Ap¬ 
proach).  This  approach  will  be  discussed  in  a  future  section.  The  disadvan¬ 
tage  with  this  approach  is  that  the  computational  burden  grows  with  time 
and  can  saturate  even  the  most  powerful  computer  systems.  This  method 
however  might  be  useful  for  the  initialization  sequence. 

The  Probabilistic  Data  Association  Filter  (PDAF)  is  based  on  three  as¬ 
sumptions: 

•  The  track  has  been  initialized 

•  There  is  only  one  target 

•  The  prior  state  estimate  is  Gaussian  distributed 

%k\k— 1  ^  J\f  (%k\k—l  5  ^k\k—l)  (^6) 

•  At  each  stage  before  the  new  set  of  measurements  arrives,  a  validation 
region  is  set  up,  as  in  (12) 

•  Out  of  the  valid  measurements  there  is  either  one  valid  measurement 
or  none  at  all 

•  The  false  measurements  are  uniformly  distributed 
Under  these  assumptions  the  events 

q  i  _  f  {zk  the  target  originated  measurement}  ,  i  —  1, . . . ,  m & 

k  \  {none  of  the  measurements  is  target  originated},  i  —  0 

(27) 

are  mutually  exclusive  and  exhaustive.  Using  the  theorem  of  total  probabil¬ 
ity  the  conditional  mean  of  the  state  can  be  written  as 

% k\k  m\Xk\Zk\ 

mk 

=  ^Eixkie^Zkipie^Zk} 

i= 0 
mk 

=  I>4t/V  (28) 

i= 0 
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where  x^k  is  the  updated  state  conditioned  on  the  event  that  the  zth  vali¬ 
dated  measurement  is  correct  and 

^tp{ekl\Zk}  (29) 

is  the  probability  that  the  zth  measurement  is  the  target  for  i  =  1, . . . ,  ra&. 
For  i  —  0  Equation  (29)  represents  the  probability  that  there  is  no  valid 
target  in  the  validation  region.  Equation  (29)  is  referred  to  as  the  association 
probability . 


3.4.2  Probabilistic  Data  Association 

The  most  difficult  part  of  this  method  is  deriving  the  association  probabil¬ 
ities.  For  this  reason  we  will  deal  with  them  first.  Equation  (29)  can  be 
written  alternatively  as 


/3ki=p{9ki\Zk}=p{eki\Z(k),mk,Zk-1}  (30) 

Using  Bayes’  rule,  this  can  be  rewritten  as 

Pk  =  \p  [Z{k)\ek\mk,Zk-i]  p{6kl\mk,Zk-i},  i  =  0,l,...,mk  (31) 


Now  we  will  derive  the  first  PDF  from  (31).  The  PDF  of  a  false  measurement 
is 


p 


zkl\°k  ,mk,zk-i 


i 

Vh 


(32) 


where  i  ^  jl ■  The  PDF  of  the  correct  measurement  is 


p[zkl\ek\mk,Zk-i]  =  Pg1N  (*fc*;4|k-i.Sfc) 

=  pyvp^o,^) 

=  pGl: — exP  { -0-3/4  Tsk  1  vk  }  (33) 

\2nSk\i 

where  J\f  [yk\  0,  Sk)  is  the  normal  PDF  with  the  argument  vkl,  a  mean  of 
zero  and  a  covariance  matrix  of  S&.  The  gate  probability  Pq  is  given  in  (15) 
and  Vk  is  given  in  (14).  After  putting  the  last  two  expressions  together2  we 
get 


P  [Z{k)\6k\mk:Zk_i] 


y-mk+lp-lM  Q>  Sk)  ,  i  =  l,...,mk 

Wmfc,  1=0 

(34) 


lrThis  is  based  on  the  assumption  that  the  clutter  is  uniformly  distributed 

2  and  by  assuming  that  the  entire  set  of  measurements  is  conditionally  independent 
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The  second  PDF  from  (31)  little  more  complex.  The  second  PDF  can  be 
shown  to  be 


Y(rnk)  =  P  {9kl\mk,  Zk_i) 

=  P{Okl\mk] 

PdPg  \PdPg  +  (1  -  PdPg 

VF(rnk) 


mk 

(1  -  PDPG ) 


-1 


/iF(rnk-l) 


liF(mk-l) 

PDPG  +  (1  -  PdPg )(UF(mfc_1) 


IPF(rnk) 


1  -1 


i  = 
i=0 


(35) 


where  Pd  is  the  probability  of  detecting  the  target  and  /ip(^fc)  is  the  PMF 
of  the  number  of  false  measurements  described  in  section  3.3.  The  derivation 
of  (35)  is  a  little  bit  technical.  Based  on  the  assumption  that  there  can  be 
at  most  one  valid  measurement,  let’s  denote  the  number  of  measurements 
as  mk  and  it’s  random  variable  as  mT .  If  the  random  number  of  false 
measurements  is  denoted  as  mF ,  then  its  value  is  m k  —  1.  Now  we  can  show 
how  to  calculate  (35). 


7*0*0  = 


7) 

?r  ^ 

ii 

mk} 

II 

mk  - 

'T' 

1,  m  = 

1 

II 

1  \mT  =  ' 

+P{G\mF 

=  rnk, 

ii 

=  m^m1 

"  =  rnk) 

f  —P{mF 

J  171  k  ^ 

=  rnk 

—  1  \mT 

=  mk }  +  0 

•  P  {  mF 

—  mnk\mT 

\  0  •  P  { mF 

=  rnk 

—  1  \mnT 

=  mk }  +  1 

•  P  { mF  - 

-  mk\mT 

=  mk  }  , 


i  =  l,...,mk 
i  =  0 

(36) 


In  the  case  where  there  is  only  one  valid  measurement  by  using  Bayes’ 
formula  we  get 


P  [mF  =  mk  —  1| mT  =  mk}  = 


P  {mT  =  mk\mF  =  mk  —  l}  P  {mF  —  mk  —  l} 
P  {mT  =  mk} 

PoPGUFipik  ~  1) 


P  {mT  =  mk} 


(37) 


PGPd  represents  the  possibility  that  the  target  has  been  detected  and  falls 
within  the  gate. 

In  the  complementary  case  where  all  of  the  measurements  are  false  mea¬ 
surements  we  get 


P  [mF  —  mk\mT  —  mk}  = 


P  {mT  =  mk\mF  =  m&}  P  \mF  =  mk} 
P  {mT  =  mk} 

(1  -  PdPg )  /Om*0 


P  {mT  —  mk} 


(38) 
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where  the  denominator  for  both  (37)  and  (38)  is  given  by 

P  [mT  —  rrik}  =  P  {mT  —  rrik\mF  —  m k  —  l}  P  {mF  =  m k  —  l} 

+P  }mT  =  rrik\mF  —  rrik}  P  }mF  —  rrik} 

=  PDPGl^Fipik  ~  1)  +  (1  -  PdPg )  VF(rrik)  (39) 

By  combining  equations  (37)- (39)  and  inserting  them  into  (36)  we  get  (35). 

The  next  stage  is  to  insert  the  expressions  for  fip  into  the  PMF.  In 
section  3.3  we  had  two  different  models.  The  first  one  was  the  parametric 
model  and  the  second  the  non-parametric  model.  The  non-par ametric  is 
simpler  and  more  robust  and  has  better  results  in  practice  according  to  [7]. 
So  by  inserting  (25)  into  (35)  we  get  the  following  result 


7  l{rnk) 


^PdPg,  i  =  l,...,mk 

1  —  PdPg,  i  =  0 


(40) 


We  went  through  all  of  the  mathematics  above  in  order  to  calculate  (3kl . 
By  inserting  (40)  and  (34)  into  (31),  we  get  (3  up  to  a  normalizing  constant 


Pk 


where 

e*  =  exp  {-0.5i/£  TS^1ukt} 

Since  equation  (41)  is  correct  up  to  a  normalizing  constant  it  can  be  written 
as 


OC 

(  Vk-mk+1PG1—±-rei-±-PDPG, 

i  |27rSfc|2 

l  V~mk  ■  (1  -  PDPG ) ,  i  =  0 

•  5  rrik 

OC 

1  Vk-mk+1%AT(Pk,0,Sk),  i  =  l,...,mk 

1  v-rnk  ■  (i  -  pdpg)  ,  i  =  o 

(41) 

n  i  1 

k  ^  l  VG\ 2irSk\i-mk — T 


i  =  1, . . .  ,mk 


hiPoPo  i  =  Q 


OC 


i  =  l,...,mk 

(f)nz/2  Cn}mk1~=%p,  i  =  0 

To  summarize,  /3k  is  calculated  by  the  following  expression 

i  =  l,...,mk 


Pk  = 


6+Epe, 


i  —  0 


where 


nz/  2 


_i  1  —  PdPg 

Cnz  ™k - - 


(42) 


(43) 


(44) 
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3.4.3  The  state  estimation 


Returning  to  equation  (28) 

mk 

xk\k  =  ^^xk\k^k  (45) 

i= 0 


we  now  know  how  to  calculate  (3lk  and  now  it’s  time  to  see  how  to  calculate 
the  updated  state  The  current  state  conditioned  on  the  past  measure¬ 
ments  and  on  the  zth  measurement  being  correct  is 


* k\k  =  ®k|k-l  +  Mkvk*  (46) 

where  the  innovation  is  defined  as 

vk  =  zk  -  4|fc-i  (47) 

The  gain  M \  is  the  same  as  in  the  standard  filter 

Mk  =  'Zk\k_1E%S?  (48) 

for  i  =  0  or  when  =  0  we  have 

Xk\k  xk\k— 1  (49) 

By  inserting 

*fc|fc  =  £k\k- 1  +  Mkvk*  (50) 

where  the  combined  innovation  is 

mk 

Vk  =  J2  Pkvk  (51) 

2=1 


The  updated  covariance  matrix  is 

E, k\k  =  E, fe|fc_i  -  [1  -  4°]  MkSkM%  +  Ek  (52) 


where  the  extra  term  in 


Si k  =  Mk 


mk 


^Pkvhvk  T  ~ 


T 

vk”k 


,i=l 


Mi 


(53) 


accounts  for  the  extra  uncertainty  added  by  not  knowing  which  measurement 
is  the  correct  measurement.  The  proof  for  this  can  be  found  in  [7].  The 
prediction  equations  are  the  same  as  in  the  standard  Kalman  filter. 


xk+l\k 

Fk%k\k 

(54) 

^k+l\k 

H k-\-l%  k+l \k 

(55) 

Sfc+l|fc 

=  ^feEfc|  kFk  +  Qk 

(56) 

$k+l 

=  Hk+iEk+i\ kHk+i  +  Rk+i 

(57) 
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The  likelihood  of  the  current  set  of  measurements  filter  given  the  previous 
measurements  is  defined  as 

A (k)  =  P  {Z(k)\rrik,  Zk_i} 

=  P{z1(k),...,zmk(k)\mk,Zk-1} 

=  P{v1(k),...,vmk(k)\mk,Zk-1} 

mk 

=  U'mfc7°(mfc)  +  U_mfc+1EPG1^'(^;0^d7j(mfc)  (58) 

3= 1 

For  the  convenience  of  the  reader  a  summary  of  the  PDAF  algorithm  is 
provided  in  Table  2. 

Comments: 

•  This  method  can  be  further  improved  by  generalizing  the  method  to 
incorporate  the  feature  or  intensity  of  the  target  in  the  calculation  of 
the  probabilities. 

•  The  method  relies  on  the  assumption  that  the  track  has  been  initial¬ 
ized.  In  the  next  section,  a  method  for  track  formation  with  clutter 
will  be  discussed. 


The  PDAF  Algorithm 


Prediction  Equations 


£k\k  —  l 

Fk-i%k-i\k-i 

%k\k-l 

Hkxk]k-i 

-A-|A-  -1 

II 

*9 

1 

M 

?r 

1 

IF 

1 

Sk 

—  HkVk\k-iHZ  ■ 

Validation  of  measurements 

m  = 

is  the  set  of  measurements  that  fall  in  the  region  defined  by 

Vk( 7)  =  {z  ■  Vk  (■ sk )  _1  Vk  <  7} 

where 

vk  =  zk  -  4|fc-i  i  =  1,. . . ,  mk 
and  rrik  is  the  number  of  validated  measurements. 

Calculate  f3kl  i  =  1, . . . ,  m k 

u  a  f  ‘^7r\ "*/2  -1  1  -PdPg 

b={ 


e»  =  exp  {—0.54  TSk  1vk1}  i  =  l,...,mk 

/V  = 


i  =  1,  •  •  •  ,mfe 


b+E,ii  e. 

_ & _  s  —  0 

&+£7=W  *_U 


Measurement  update 

If  rrik  =  0  then: 
State  update: 


Else: 


Covariance  update: 

Combined  Innovation: 

Kalman  Gain: 

State  update: 
Covariance  update: 


Xk\k  %k\k  —  l 


^k\k  —  £fc|fc-l 


mk 

=  E  Pkvk 

i=  1 

Mk  =  1 

'^7.‘  /.■  -  i  "t-  Mki'k 


Efe  —  Mk 


E /?*>*>**  T  -  vkvk 


i= 1 


Sfcife  —  —  [l  —  4°.  MkSkMk  +  E/e 


Table  2:  Summary  Of  The  PDAF  Algorithm  (One  Cycle) 
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4  Tracking  a  low  SNR  target  with  background 
clutter 

4.1  Introduction 

Algorithms  for  track  initiation  can  be  divided  into  two  different  categories: 

•  Non-Bayesian/Logic  based  methods 

•  Bayesian  methods 

The  first  approach  is  described  in  [7]  in  sections  2.6  and  7.4.  The  second 
approach  can  be  done  either  by  using  the  Optimal  Bayesian  approach  as 
described  in  [7]  section  3.5  or  by  using  IMM-PDAF  (Interactive  Multiple 
Model)  filter  from  section  4.4.  We  describe  in  the  second  approach  using  the 
IMM-PDAF  filter  for  the  track  formation.  Before  that  we  give  a  brief  outline 
of  the  different  models  that  are  used  to  describe  the  behavior  of  manoeuvring 
targets.  After  that,  we  give  a  brief  outline  of  the  IMM  algorithm  and  then 
describe  how  to  form  a  track  with  this  method. 

4.2  Modeling  the  behavior  of  a  manoeuvring  target 

I  will  give  a  brief  outline  on  some  of  the  more  popular  models  used  for 
tracking  manoeuvring  targets.  This  is  in  no  way  exhaustive  and  in  some 
cases  more  exotic  models  are  required. 

The  most  basic  and  simple  model  is  that  of  a  stationary  target.  For  the 
sake  of  simplicity,  we  analyze  the  problem  in  two  dimensions.  This  does 
not  limit  the  generality  of  the  model  as  one  can  easily  extend  the  model  to 
higher  dimensions. 

Our  proposal  for  the  stationary  model  is  the  following: 

%k+i  —  Fxk  +  Gwk  (59) 

zk  =  Hzk  +  vk  (60) 

where 

wk  ~  N  ( 0,  qkInx)  (61) 

where  nx  is  the  dimension  of  the  state  and 

rk  ~  J\f  (0,  rkIUz)  (62) 

where  nz  is  the  dimension  of  the  measurement.  The  SS  (State  Space)  ma¬ 
trices  are  given  by 


F  = 

h 

(63) 

H  = 

h 

(64) 

G  = 

1 

(65) 
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and  the  state  includes  only  the  position  of  the  target 


x(k) 

ry(k) 


where  rx  corresponds  to  the  position  on  the  x-axis  and  ry  corresponds  to 
the  position  on  the  y-axis.  Usually,  one  will  assume  that  qk  is  very  low  and 
constant  in  time.  This  describes  a  target  that  wobbles  around  itself  and 
mostly  stays  in  the  same  place3. 

The  next  model  is  called  the  Uniform  Motion  Model  or  White  Noise 
Acceleration  Model  and  has  been  described  before: 


rx(k  +  1)  =  rx(k)  +  Tw^ 


(67) 


where  rx  denotes  the  speed  on  the  x-axis.  The  equations  for  the  position 
and  speed  on  the  x-axis  are 


rx(k)  +  T  • 

Av.  Speed 

(68) 

rx(k)  +  7T 

•  (rx(k- 1-1)  +  rx(k)) 

(69) 

rx(k)  +  ^ 

•  (2rx{k)  +  Tw{k)^ 

(70) 

rx(k)  +  T  ■ 

rx(k )  + 

(71) 

The  equations  for  position  and  speed  on  the  y-axis  are  similar.  To  summarize 
the  matrices  for  the  SS  equations  (59)  and  (60)  are  given  by 


F 


G 

H 


10  0  0 
T  1  0  0 

0  0  10 
0  0  T  1 

T  0 

T2/ 2  0 

0  T 

0  T2/  2 

0  10  0' 

0  0  0  1 


where  the  state  vector  is 


Xk 


rx(k)  " 
rx(k ) 
fy(k) 
ry{k )  . 


(72) 


(73) 

(74) 


(75) 


3  Note  that  one  is  assuming  that  the  white  noise  input  is  piece-wise  constant  if  this 
is  not  the  case,  one  has  use  the  method  described  in  [5]  section  2.3  and  section  2.7  for 
approximating  a  continuous  SS  (state  space)  model  by  a  discrete  SS  model 
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For  an  accelerating  target,  the  most  common  model  used  is  referred  to 
as  the  Wiener  Process  Acceleration  Model.  By  augmenting  the  state  vector 
with  a  variable  for  the  acceleration,  we  get  here  a  third  order  model  with 
the  following  matrices: 


F 


G 


H 


1  0  0  0  0  0  ' 

T  1  0  0  0  0 

T2/2  T  1  0  0  0 

0  0  0  1  0  0 

0  0  0  T  1  0 

0  0  0  T2/ 2  T  1  _ 

1  0  ' 

T  0 
T2/ 2  0 

0  1 
0  T 
0  T2/ 2  _ 

001000' 

0  0  0  0  0  1 


where  the  state  vector  is 


Xk 


ax(k)  ' 
rx(k) 
rx(k ) 

ay(k) 

ry(k) 
rv(k)  . 


(76) 


(77) 


(78) 


(79) 


A  description  of  these  models  can  be  found  [5]  section  2.7.  More  sophis¬ 
ticated  models  such  as  the  Singer  Model  for  maneuvering  targets  with  a 
colored  noise  input  [8]  or  James  Helferty’s  Turn-Rate  Distribution  Model  [  | 
can  be  used  if  necessary,  but,  for  this  particular  problem  dealt  with  in  this 
project,  the  first  3  models  should  suffice. 

Another  model  worthy  of  mentioning  is  the  Coordinated  Turn  Model. 
This  model  is  especially  useful  for  Air  Traffic  Control  where  civilian  aircraft 
perform  turns  of  a  constant  radius  before  landing.  A  description  of  it’s 
application  is  given  by  Bar-Shalom  and  Li  in  [10].  A  brief  survey  of  some  of 
the  models  mentioned  here4  is  given  by  Lerro  and  Bar-Shalom  in  [1  ]. 


4.3  The  Interacting  Multiple  Model 

When  there  is  uncertainty  regarding  the  type  of  model  and  the  parameters 
of  the  model,  it  is  necessary  to  use  multiple  models.  For  example,  take  a 

4 Except  for  the  Singer  Model  and  Helferty  Model 
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target  that  is  driving  along  a  road  and  then  leaves  the  road  and  starts  driving 
across  rough  terrain.  One  would  require  two  models  in  this  case.  The  IMM 
(Interacting  Multiple  Model)  is  a  hybrid  approach  that  has  an  effect  of  soft 
switching  between  different  models.  Its  performance  is  comparable  to  other 
algorithms  such  as  the  GPB  (Generalized-Pseudo-Bayesian)  approach  and 
it  is  shown  by  Bar-Shalom  to  be  more  efficient  [7]. 

The  hybrid  system  is  not  surprisingly  formulated  by  a  set  of  linear  SS 
equations  corresponding  to  each  individual  mode  or  model. 

x{k  +  1)  =  F{k,  Mk)x{k)  +  G(k,  Mk)w{k,  Mk)  (80) 

z(k)  =  H(k,Mk)  +  v(k,Mk)  (81) 

where  Mk  is  the  true  mode  at  time  k.  The  state  noise  w  (k,Mk)  and  the 
measurement  noise  are  uncorrelated  with  the  Gaussian  initial  state  x(0); 
they,  are  mutually  uncorrelated  white  Gaussian  noise  vectors  with  the  co- 
variance  Q  (/c,  M.j )  and  R  (/c,  Mj )  respectively. 


The  idea  behind  the  algorithm  is  to  use  homogeneous  Markovian  tran¬ 
sition  system  where  the  probability  of  moving  from  one  mode  to  another 
is 

P  {M(k  +  1)  =  Mj\M(k)  =  Mi}  =  TTij  Vi,  j  e  M  (82) 

where  M  denotes  the  set  of  all  modes  assumed  to  be  in  the  MM  (Multiple 
Model)  scheme  and  M(k)  represents  the  mode  at  time  k. 


Now  I  will  describe  the  conventions  used  in  this  model 


Xj{k\k),  Pj(k\k) 

XQj  (k\k), 

Poj  m 
x(k\k),  P(k\k) 

w(k 0 

N\j(k) 

A#) 


state  estimate  of  the  mode-matched  filter  j  at  k  and  its  covariance 

mixed  condition  for  the  mode  matched  filter  j  at  k 

combined  state  estimate  and  its  covariance 

mode  probabilities  for  filter  j  at  time  k 

the  probability  of  going  from  mode  i  to  j  at  time  fc, 

the  likelihood  function  of  mode  matched  filter  j 


The  idea  behind  the  IMM  algorithm  is  simple.  Say  we  have  two  different 
models.  At  every  different  stage,  the  number  of  possible  options  is  2.  Either 
the  target  is  in  model  1  or  model  2.  Then  for  the  next  stage  the  target 
can  be  in  either  model  1  or  model  2.  As  the  process  continues  the  number 
of  possibilities  grows  exponentially.  In  this  specific  case  of  two  models,  the 
number  of  possibilities  grows  at  a  rate  of  2k  like  a  binary  tree.  In  order  to 
avoid  this,  one  has  to  make  use  of  sub-optimal  algorithms.  The  IMM  does 
this  by  a  mixing  stage  before  individually  applying  the  Kalman  filter  to  each 
stage.  This  is  demonstrated  in  Figure  4. 
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Interacting  Stage 


Filtering  Stage 


Xj  (k  |  A:) 


(k  I  k) 


Figure  4:  IMM  diagram 

More  details  and  background  on  this  algorithm  are  discussed  by  Blom  in 
[11]  and  by  Bar-Shalom  and  Li  in  [12]  and  section  1.5.4  of  [7].  The  algorithm 
is  described  in  Table  3. 

When  plugging  the  PDAF  algorithm  into  the  IMM,  one  has  to  replace 
Equation  (92)  with  the  expression  for  the  combined  innovation  for  the  PDAF 
from  (51).  The  likelihood  function  from  (95)  is  replaced  with  the  expression 
from  (58).  The  association  event  probabilities  (3  are  calculated  by  using 
equation  (41)  instead  of  (42).  This  is  necessary  in  the  event  of  Pd  =  0. 
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Interaction  (Vj  E  M): 

•  predicted  mode  probability: 

Vj  =  =  22n -  1) 

i 

(83) 

• 

mixing  probability: 

Vi \j  =  P{Mi(k  -  1  )\Mj(k),Zk_1}  =  7 TijHi(k  -  l)/Vj 

(84) 

•  xQj  (k  -  l\k  -  1)  =  E  [x(k  -  1)|  Mj{k),  Zk_  i]  =  22%i(k  -  1|  k  -  l)vi\j 

i 

(85) 

•  Pojik  -  1| k  -  1)  =  22 pi(k  ~  1| k  -  l)vi\j  +Xj 

i 

(86) 

• 

where  the  ”spread-of-the-means”  term  in  the  mixing  is 

Xj 

=  22  —  1 1 A;  —  1)  —  xQj(k  —  l\k  —  1)]  [xi(k  —  l\k  —  1)  —  xGj(k  —  1  \k  —  1)]T 

Pi  j 

i 

(87) 

Filtering  (Vj  E  M): 

•  state  prediction: 

1"H 

1 

1 

0 

1 

II 

1 

(88) 

• 

covariance  prediction: 

Pj(k\k  -  1)  =  Fj(k  -  1  )Poj(k  -  1| k  -  1  )Fj(k  -  1  )T  +  Gj{k  -  1  )Qj(k  -  1  )Gj{k  -  1)T 

’  (89) 

• 

residual  covariance: 

Sj  =  HjPjQt \k  -  1  )Hj  +  Rj 

(90) 

• 

filter  gain: 

Wj=Pj(k\k-l)Hjsr1 

(91) 

• 

residual: 

vj  =  z(k)  —  HjXj(k\k  —  1) 

(92) 

• 

state  correction: 

Xj(k\k)  =  Xj  (k\k  —  1)  +  WjVj 

(93) 

• 

covariance  correction 

Pj(k\k)  =  Pj{k\k  -  1)  -  WjSjWj 

(94) 

• 

likelihood  function: 

A j  =M{vj-0,Sj)  =  |27r5j|_1/2exp 

(95) 

• 

mode  probability: 

Vj  A? 

3  EiWAi 

(96) 

Combination: 

•  x  =  E  [x{k) \Zk\  =  Xj  ( k\k)fij 

i 

(97) 

•  P(k\k)  =  E  ^[x(k)  —  x(k\k)][x(k)  —  x(k\k)]T  \Zk^  =  Pj {k\k)yj  +  X 

i 

(98) 

• 

where  the  “spread-of-the-means”  term  in  combination  is 

X  =  22  [xi(k\k)  —  x(k\k)]  [: Xi(k\k )  —  x(k \k)]T  m 

(99) 

i 


Table  3:  Summary  Of  The  IMM  Algorithm  (One  Cycle) 
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4.4  Combining  The  IMM  with  the  PDAF 

In  order  to  apply  the  IMM-PDAF  algorithm  to  the  problem  of  track  initi¬ 
ation,  two  models  are  used:  “observable  target”  (true  target), All,  an  “un¬ 
observable  target”  (no  target)  model,  Afo-  Both  filters  are  Uniform  Motion 
Models.  The  difference  between  them  is  that  in  the  “unobservable  target” 
model  Pd  —  0.  The  TTP  (True  Target  Probability)  is  used  to  discriminate 
between  false  tracks  and  the  real  track.  The  TTP  is  defined  as 

TTP(k)  =  (100) 

where  fii(k)  is  the  mode  probability  for  the  “observable  target”  model. 
There  are  two  different  ways  to  implement  this.  The  first  is  the  fixed  window 
implementation.  The  second  is  the  sliding  window  implementation,  when 
unvalidated  measurements  are  used  to  initiate  new  tracks.  This  is  better, 
since  when  the  algorithm  locks  on  to  the  wrong  track  there  is  a  chance  of 
switching  back  to  the  right  track.  Another  way  to  improve  the  results  is  to 
include  the  Amplitude  Information.  This  will  be  discussed  in  later  sections. 

The  process  of  track  formation  in  fixed  window  implementation  consists 
of 

1.  Track  Pair  Initiation:  Initial  pairs  of  measurement  are  created  by 
using  the  first  two  scans  or  images.  The  pairs  are  further  pruned  by 
using  the  following  criterion 


and 


^(l)-^(l)  <^lmaxT  +  2^Ril 
(2)  -  z{  (2)  I  <  ^2maxT  +  2 Vi  ^  j 


(101) 


where  zk(  1)  is  the  ith  measurement  for  the  x-axis  (1)  at  time  k  and 
zk( 2)  is  the  ith  measurement  for  the  y-axis  (1)  at  time  k. 

2.  Tracking  Maintenance:  For  k  =  3, . . . ,  N\y  the  IMM-PDAF  algo¬ 
rithm  is  implemented  with  two  models  as  previously  explained.  In  this 
method  the  Track  before  Declare  or  Track  before  Detect  procedure  is 
used.  The  markov  transition  matrix  used  to  describe  the  transition 
between  models  is 


0.98  0.02 
0.02  0.98 


(102) 


The  track  with  the  highest  TTP  is  given  preference  when  assigning 
measurements  in  order  to  avoid  giving  the  same  measurement  to  dif¬ 
ferent  tracks.  The  gating  region  has  to  be  the  same  for  both  models. 
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Otherwise  one  cannot  perform  the  calculation  in  (96).  Both  models 
also  have  to  have  the  same  number  of  measurements  (by  union  of  all 
the  gates  for  each  individual  model).  In  practice  the  volume  of  the 
gating  region  is  taken  from  the  model  with  the  largest  volume.  When 
the  track  stops  getting  measurements  in  its  gate  region  the  covariance 
matrix  grows  and  with  it  the  gating  region.  This  is  because  the  gating 
region  grows  so  large  it  eventually  gets  flooded  with  measurements. 
To  prevent  this  from  happening,  tracks  with  too  many  measurements 
are  dropped. 

Tracks  with  a  TTP  below  a  certain  threshold  specified  by  the  user  are 
dropped.  In  the  sliding  window  implementation,  unassociated  mea¬ 
surements  are  used  to  initiate  new  tracks.  This  is  especially  important 
in  a  problem  with  a  low  Pjj. 

The  IMM-PDAF  algorithm  can  also  be  useful  for  tracking  a  manoeuver- 
ing  target  once  the  real  track  has  been  confirmed.  By  adding  on  additional 
models,  one  can  dramatically  increase  the  robustness  of  the  algorithm  and 
maintain  the  track.  Another  advantage  is  that  by  monitoring  the  TTP,  one 
has  a  fairly  reliable  criterion  for  terminating  a  track. 

4.5  Using  Amplitude  Information  in  the  IMM-PDAF 

It  is  possible  to  improve  the  reliability  of  the  tracking  by  adding  the  in¬ 
formation  about  the  amplitude  onto  the  measurement  vector.  The  set  of 
measurements  is  obtained  by  thresholding  an  image;  thereby  one  obtains  a 
set  of  measurements  with  varying  amplitudes.  If  one  takes  into  account  only 
the  position  and  not  the  amplitude  one  loses  valuable  information.  By  using 
both  one  should  expect  a  drastic  improvement  in  the  performance.  First  a 
description  of  the  PDAFAI  (PDAF  Amplitude  Information)  will  be  given  as 
described  in  [7]  and  [13].  Secondly  I  will  explain  in  detail  how  to  combine 
this  algorithm  with  the  IMM  algorithm. 

4.5.1  PDAFAI 

The  statistical  information  about  the  amplitude  can  be  incorporated  into  the 
PDAF  algorithm  in  the  following  manner.  One  assumes  that  the  following 
information  is  available: 

•  po(a)  =  PDF  of  the  amplitude  it  is  due  to  noise  only 

•  Pi  (a)  —  PDF  of  the  amplitude  if  it  originated  from  the  target 
The  simplest  model  for  the  signal  amplitude  is 

Hq  .  CLk  —  Tlfc 

:  ctk  =  s  +  rik 
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where  Hq  is  the  hypothesis  for  no  target,  H\  is  the  hypothesis  for  a  target,  s 
is  the  real  amplitude  and  n &  is  assumed  to  be  i.i.d.  additive  white  gaussian 
noise  with  a  variance  of  cr^.  Then,  we  use  the  maximum  likelihood  to  es¬ 
timate  the  variance  of  the  pixels  in  the  first  frame.  As  each  frame  in  the 
movie  is  pre-processed  and  normalized  the  background  noise  does  not  vary 
much  over  time.  If  the  signal  strength  is  not  known  (and  this  usually  is  the 
case)  then  one  can  estimate  it  by  taking  the  maximum  pixel  in  each  frame 
defined  as 

Sk  —  ma  x{uk(m,n)}  (104) 

m,n 

where  Uk(m,ri)  is  a  pixel  at  index  (m,n)  at  time  k.  Then  the  estimated 
signal  strength  is 

1  K 

s  =  (105) 

iV  1=1 

This  is  a  weak  assumption  but  it  works  very  well  for  real  data.  When  there 
is  no  target  then  one  gets  a  signal  at  the  height  of  the  noise.  If  the  noise  is 
white  then  the  tracks  that  are  initiated  eventually  get  killed  off  during  the 
tracking  process.  Then  the  probability  functions  are  assumed  to  be 

•  Po(a)  =  M  (a;  0,  er^) 

•  pi(a)  =  JV  (a;  s,a%) 

It  is  assumed  that  the  strength  of  the  signal/amplitude  s  is  known  or  more 
or  less  constant;  therefore,  the  estimate  is  fairly  good.  In  the  detection 
problem  where  one  is  tracking  a  point  target  with  a  fluctuating  amplitude 
strength,  a  slight  modification  is  required.  In  Fig.  5  there  is  an  example 
of  the  signal  strength  in  an  Irena  movie.  The  method  for  producing  these 
movies  is  explained  in  [  ].  It  is  in  fact  the  maximum  pixel  in  each  frame. 

One  can  see  that  signal  is  a  combination  of  a  periodic  signal  that  has  a  period 
of  8  frames  and  a  gradually  increasing  average  that  changes  over  time.  The 
periodic  signal  can  be  explained  by  the  nature  of  the  signal.  It  is  spread 
over  3  or  4  neighboring  pixels.  When  the  center  of  target  moves  from  one 
pixel  to  the  next,  its  intensity  is  spread  evenly  between  a  few  pixels  and  it  is 
at  its  lowest  intensity.  When  the  center  of  the  target  is  in  the  center  of  the 
pixel,  its  intensity  is  the  highest.  In  the  synthetic  data  here,  the  target  is 
traveling  at  a  speed  of  1/8;  the  results  fit  our  theory.  The  changing  average 
can  be  explained  by  clouds  entering  and  leaving  the  picture.  A  better  model 
for  this  case  would  be 


Hq  .  CLfc  —  Tljz 

Hi  :  =  s  + 


(106) 


where  £&  is  a  i.i.d.  white  gaussian  noise  and  is  uncorrelated  with  n The 
variance  of  £&  is  estimated  by  taking  the  sample  variance  over  The 


31 


Figure  5:  The  signal  amplitude  taken  from  an  Irena  movie 


variance  is  calculated  by 


dl  =  (107) 

1=1 

Finally  the  PDF’s  for  both  the  target  present  hypothesis  H\  and  no  target 
hypothesis  Hq  is  assumed  to  be 

•  p0(«)  =  M  (a;  0,  al) 

•  pi  (a)  =  N  (a;  s,  +  afj 

In  practice  the  best  results  are  obtained  by  using  the  moving  average  of 
the  estimated  signal  amplitude.  Take  for  example  the  following  estimated 
signal  amplitude  and  its  moving  average  for  10  samples  in  Fig.  6 

Another  reason  why  this  is  better  is  because  one  can  implement  this 
method  online  while  scanning.  The  previous  model  with  the  added  noise  £ 
has  to  be  carried  out  offline.  The  estimation  of  the  signal  variance  cr|  can 
only  be  carried  out  after  scanning  the  whole  video.  In  this  case  the  PDF’s 
for  both  the  target  present  hypothesis  Hi  and  no  target  hypothesis  Hq  is 
assumed  to  be 
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Figure  6:  The  solid  blue  line  is  the  estimated  signal  amplitude  and  the 
dashed  green  line  is  it’s  moving  average 


•  Po(a)  =V  (a;  0,0-2) 

•  pi  (a)  =  N  (a;sMA(k),a%) 

where  the  moving  average  is  calculated  by 

1  ° 

SMA(k)  =  —  J2  s(0  (108) 

U  l=-10 

After  finding  an  appropriate  model  for  the  problem,  the  probability  of 
detection  Pjj  and  the  probability  of  a  false  alarm  Pfa  are  calculated  with 
the  following  equation: 


Pd  = 

POO 

/  pi(a)da, 

J  T 

a  >  r 

(109) 

Pfa  = 

roc 

J  p0{a)da, 

a  >  r 

(110) 

After  applying  the  threshold,  the  resulting  distributions  for  “target”  and 
“no-target”  measurements  are 


33 


Pi («)  = 

Po(a)  =  -p^-po(a) 


(111) 

(112) 


The  additional  information  about  the  amplitude  will  be  included  in  the 
algorithm  by  rewriting  equations  (32)  and  (33). 


P 


zk\6k =^p5K2) 


i 


where  i  ^  j-  In  order  to  simplify  the  calculations  define  the  ratio 

p\{<) 


Po(c 


(113) 


(114) 


The  PDF  of  the  correct  measurement  is 


p[zkl\9kl,mk,Zk- 1]  =  PG1A/'(2:A.*;4|fc_i,S,fc)p];(ai.J 


=  P 


-l 


1 


G  T - -r  exp  {-0.54  T5fc1^}p[(a^) 

1 27T S'fc  I  2 


(115) 


and  by  inserting  equations  (113)  and  (115)  into  (34)  we  get  the  following 


r  n  I  Wmfe+1^^(^;0,5fc)K(a;)nr=fe1P5(V),  i  =  l,- 

p[Z{k)\dh\mh,Zk-X]  =  {  . 

\  vrkmiPoM,  ^=0 

f  V-^+1PD^Ar(^;0,Sk)\nTJiPToM,  i  =  l,...,mk 

ICIpi  PToM,  i  =  0 

After  inserting  this  into  equation  (31)  and  after  a  bit  of  normalizing  one  gets 


/V  =  l  b+E^ejxj 


mk  „  ,\  .5  i  —  1,...,  rrik 


,  i  —  0 


where 


and 


*+E£?i^V 

ei  -  PGlj^  {vk,®->Sk) 


(117) 


(118) 


,mfc 

(116) 
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(119) 


The  likelihood  of  the  filter  is  given  by 


A  (k)  4  P{Z(k)\mk,Zk-X] 

=  P{zi(k),...,zmk(k)\mk,Zk_i} 
=  P{i/i(k),...,vmk(k)\mk,Zk-i} 


=  Vk  mfe7°("ifc)  Ylpo(akJ) 


+vk  mk+l\\Mak A 


j= 1  1=1 


14  mfc7°("ifc)  n^°(V) 


+V  mfc+1  UpoM  Yj  PGlejXj^(mk)  (12°) 


3= 1  j=l 


where  7 j(m^)  is  given  by  the  nonparametric  clutter  model  from  (40)  and 
hereby  the  final  result  is 


(121) 


4.5.2  The  IMM-PDAFAI  Algorithm 

In  the  implementation  of  the  IMM-PDAFAI,  a  few  modifications  are  neces¬ 
sary.  The  first  is  that  the  same  number  of  measurements  is  required  for  all 
models.  This  is  done  by  using  a  common  gating  region  or  a  union  of  all  the 
gates.  In  practice  this  is  done  by  taking  the  gate  with  the  largest  region. 
Ideally,  one  would  apply  the  PDAFAI  separately  to  each  model;  however 
the  likelihood  functions  for  each  model  have  to  have  the  same  number  of 
measurements  in  order  to  apply  the  equations.  The  likelihood  function  is 
calculated  for  Mj(k ),  model  j,  at  time  k 
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A#) 


^  P{Z(k)\Mj(k),mk,Zk_l) 

mk 

=  V^mk^°(mk)Y[p0(akl) 

1=1 

mk  mk 

+V~mk+1  IIMO  5^  (Vt  0,  A a\mk) 

1  =  1  1  =  1 

(122) 


where 


V  =  4  -  Vi  423) 

where  z\  is  the  Ith  measurement  at  time  k  and  the  measurements  are  the 
same  for  all  of  the  models.  The  prior  estimated  measurement  for  the  jth 
model  is  zJk_v  The  measurement  covariance  matrix  for  the  jth  model  is 

given  by  SkJ . 

To  put  it  in  a  slightly  different  and  more  comprehensible  manner 


1 

f  fe  +  E5  PcF°ir>+‘  n"&  nW) 

,  “target  model” 

A j(k)  = 

1 

1  V-^UTAPo(akj), 

“no-target  model” 

(124) 

where 

e:hl  =  PGlM  (If;  0,  S',/)  i  =  1, . . . 

,mk  (125) 

and 

,  a  1  —  PdPg 

PgPd Vi 

(126) 

A  summary  of  the  IMM-PDAFAI  for  the  filtering  stage  is  given  in  Table 
4.  The  rest  of  the  algorithm’s  stages  are  the  same  as  in  Table  3. 


4.5.3  Track  Formation  with  the  IMM-PDAFAI  Algorithm 

Lets  assume  that  one  is  using  the  sliding-window  version  of  the  algorithm. 
This  means  that  at  time  k  and  k  +  1  unclaimed  measurements  are  used  to 
initiate  new  tracks.  The  IMM-PDAFAI  is  employed  on  the  resulting  pairs 
from  k  +  2.  At  stages  k  and  k  +  1  where  there  is  no  original  uncertainty 
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Filtering  (Vj  G  M): 


Prediction  Equations 

•  state  prediction:  —  1)  =  Fj{k  —  1  )x0j{k  —  l\k  —  1) 

•  covariance  prediction: 

-  1)  =  -  l)Poj(fe  -  l|fc  -  1  )Fj(k  -  if  +  Gj(k  -  1  )Qfk  -  1  )Gj(k  -  1)T 


•  residual  covariance:  Sj  =  HjPj(k\k  —  1  )Hj  +  Rj 

•  filter  gain:  Wj  =  Pj(k\k  —  1  )Hj Sj1 

•  Gate  size  Vf.  =  cnz^nz/2  •  \Sk  |1/2 

The  largest  gate  region  is  defined  as  Vk  =  maxj  { Vk  0  } 

Validation  of  Measurements  (only  for  “no-target”  models) 

Z(k)  =  {***}“* 

is  the  set  of  measurements  that  fall  in  union  the  regions  dehned  by 

V2  =  {z:vt’iT[sk>y\ki’i<  7} 

where 

V’’  =  zk  -  4|fc-i  *  = 
and  nik  is  the  number  of  validated  measurements. 

Calculate  f3kl  i  —  1, . . . , vrik 


,  a  ffaf) 

Polak) 


i  =  l,...,mk 


hi 


A_ 


mk 


1  -  PdPg 
VkPGPD 


ejf  —  Pg 


1V(^i;0,V) 


/3°kl  —  <  L/’Gz  5 


i=li...,T7lk 
i  =  0 


Measurement  update 


If  mk  —  0  then: 

State  update:  Xj(k\k)  —  Xj(k\k  —  1) 
Covariance  update:  Pj{k\k)  =  Pj(k\k  —  1) 
Likelihood  function:  Vj  A j{k)  —  1 

If  Afo=“no  target”  and  M 1  —  ’target’  then 
A 0(k)  =  1  —  Pfa  and  Ai (k)  =  1  —  Pd 


Else: 

Combined  Innovation:  Vj(k)  —  YlT=i  ^k^k 
Kalman  Gain:  Wj(k)  =  Pj{k\k  -  1  )H^[SJk]~1 
State  update:  Xj(k\k)  =  Xj(k\k  —  1)  +  Wj (k)i/j (k) 

Covariance  update:  P(k)  =  Wj(k )  |^2i  fik'%vk'%vk%  T  —  Vj(k)vj(k)T j  Wj(k)T 

Pj{k\k)  =  - 1)  -  [1  - ^’°]  WjWsjWjikF  +  p(fe) 

Likelihood  function:  Equation  (124) 


•  mode  probability:  fij  =  ^  L  JA 


Table  4:  Summary  Of  The  Filtering  Stage  for  the  IMM-PDAFAI  Algorithm 
(One  Cycle) 
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for  an  initiating  pair,  the  likelihood  function  relies  only  on  the  amplitude 
information: 


Ai(fc)  =  p\(flk)  for  the  “target  model”  (127) 

Ao(fc)  =  Po(ak)  for  the  “no-target  model”  (128) 

The  initial  mode  probabilities,  for  an  initial  pair  at  time  fc,  are  assumed  to 
be 


/zi(fc)  =  0.5  (129) 

p0(k)  =  0.5  (130) 

where  the  mode  probabilities  are  updated  according  to  equations  (83)  and 

(96).  This  algorithm  is  not  in  fact  a  Multi-Target  tracking  algorithm,  but 

some  ad  hoc  changes  can  be  made  in  order  to  track  more  than  one  target 

when  it  is  known  that  there  is  only  one  target 

4.5.4  Common  Problems  with  the  Algorithm  and  their  Solutions 

No  initial  pairs  The  threshold  is  lowered  until  there  exist  legitimate  pairs 
to  start  the  process. 

Too  many  initial  pairs  The  threshold  is  raised  until  the  number  of  initial 
tracks  drops  below  a  predetermined  threshold.  Too  many  tracks  are 
a  problem  as  this  algorithm  isn’t  strictly  a  multiple  model  tracking 
algorithm.  This  algorithm  can  theoretically  track  more  than  one  target 
as  long  as  the  tracks  don’t  cross  over  each  other. 

All  of  the  tracks  get  killed  off  during  the  tracking  process  The  thresh¬ 
old  is  dropped  until  there  exist  initial  pairs  to  carry  on  the  tracking 
process. 

Tracking  gate  gets  flooded  with  measurements  for  a  single  track  When 
the  output  covariance  matrix  Sk  as  defined  in  Equation.  (11)  grows, 
so  does  the  gate  size.  Then,  the  gate  becomes  so  large,  it  takes  up 
all  of  the  measurements  including  measurements  that  belong  to  other 
tracks.  This  happens  because  the  more  measurements  that  enter  the 
gate  the  more  spread,  there  is  between  the  innovations,  consider  Equa¬ 
tion.  53.  Unfortunately,  this  ends  up  like  a  rolling  snowball.  Things 
can  only  get  worse.  The  simplest  solution  is  to  not  allow  tracks  with 
large  output  covariance  matrices  access  to  any  measurements. 

The  number  of  total  tracks  crosses  a  reasonable  number  When  there 
are  too  many  tracks  in  a  small  area,  they  start  stealing  measurements 
from  each  other.  That  means  that  even  legitimate  tracks  are  some¬ 
times  denied  access  to  their  measurements  and  are  killed  off  because  of 
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this.  Also,  a  large  number  of  tracks  is  an  unnecessary  computational 
burden.  A  large  number  of  tracks  typically  occurs  when  there  is  no 
target  at  all  or  when  the  target  gets  lost  in  the  clutter.  Our  strategy 
is  to  stop  the  tracking  process  when  this  happens.  In  order  to  take 
into  account  the  size  of  the  image,  the  criteria  for  exiting  the  tracking 
algorithm  is  the  following: 

ifpf  >  0.03  or  ImageSize  *  pg  >  100,  then  exit. 

This  exit  strategy  is  used  most  often  when  there  is  no  target  and  the 
algorithm  is  tracking  only  clutter. 


5  Hyperspectral  Movies 

The  main  focus  of  the  thesis  is  to  examine  the  effect  of  different  detection 
algorithms  and  assumed  target  signatures  on  the  tracking  process  when  ap¬ 
plied  to  Hyperspectral  movies.  Lets  assume  that  the  target  was  implanted  in 
a  manner  described  by  L.  Varsano,  I.  Yatskaer  and  S.R.  Rotman,  ” Tracking 
Point  Targets  in  Hyperspectral  Data”,  accepted  by  Opt.  Eng. 

u(m,  n)'  =  u(m,  n)  +  9t  (131) 

where  t  is  the  target  signal  vector  6  is  the  relative  intensity.  There  are 
different  types  of  detection.  The  first  is  when  the  signal  is  known;  in  that 
case,  the  matched  filter  is  used.  The  equation  for  the  matched  filter  is: 

MF(m,  n)  —  tTE-1  (u(m,  n)  —  u(m,  n))  (132) 

The  second  is  when  the  signal  is  unknown;  then,  we  use  the  RX  filter: 

i?X(m,  n)  —  (u(m,  n)  —  u(m,  n))TE-1  (u(m,  n)  —  u (m,  n))  (133) 

6  System  Design  and  Analysis 

Designing  a  system  is  a  bit  like  tailoring.  Anyone  with  a  bit  of  cloth  and 
a  needle  and  thread  can  make  a  suit,  but  it  won’t  necessarily  be  worth 
anything  if  it  is  not  suited  to  the  client.  There  are  a  bewildering  number  of 
possibilities  to  choose  from.  The  only  way  to  make  heads  and  tails  out  of 
the  designing  problem  is  to  have  a  trustworthy  method  for  comparing  and 
analyzing  the  systems  performance. 

I  suggest  two  different  score  methods.  The  first  is  simple.  If  the  real 
targets  track  is  known,  then  the  test  is  simple.  Usually  there  are  a  few 
possible  tracks  at  the  end  of  the  process.  Only  tracks  that  have  a  lifetime 
of  more  than  50%  are  taken  into  consideration.  Then  the  track  with  the 
highest  TTP  100  is  the  winning  track.  The  next  step  is  to  compare  the 
track  with  the  real  track.  This  is  done  by  calculating  the  MSE  between  the 
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estimated  track  and  the  real  track  position.  We  do  this  from  the  creation  of 
the  track  denoted  as  tbirth  until  the  end  of  the  tracking  process. 

1  N  / - 

MSE  =  —  yMn)  -rx(n))2  +  (ry(n)  -  ry(n))2  (134) 

n= ^birth 

where  rx(n)  is  the  real  position  on  the  x-axis  and  rx(n)  is  the  estimated 
position  on  the  x-axis.  The  same  applies  for  ry{n)  and  ry{n).  If  MSE  <  1, 
then  the  track  is  valid. 

The  second  method  is  based  on  the  TTP.  First,  using  the  real  track,  we 
find  the  real  track  if  it  exists.  The  TTP  of  the  real  track  is  termed  TTPtrue- 
If  TTPtrue  doesn’t  exist,  then  it  is  set  to  zero.  Then  the  maximum  of  the 
remaining  false  tracks  is  termed  max{TTPja/se}.  If  there  are  no  false  tracks, 
then  ma x{TTPfaise}  is  set  to  zero.  Then  the  criterion  is  the  ratio  between 
the  two.  However,  in  order  to  get  a  score  between  0  —  1  the  following  equation 
is  used 

-  arctan  ( - PT™e  \  )  (135) 

vr  \  max{TT  P false }  /  ^  7 

If  the  score  is  equal  or  larger  than  0.5,  then  the  tracking  is  successful;  a 
failure  occurs  for  scores  less  than  0.5. 

Because  the  results  are  dependent  on  the  scenario,  we  take  ten  different 
frames  and  repeat  the  process,  then  we  average  the  scores  from  both  tests. 
We  refer  to  the  first  method  as  the  success  rate  and  to  the  second  test  as 
the  score  rate. 

These  are  tests  for  when  the  target  is  known  to  be  in  the  picture.  When 
the  tracking  algorithm  is  applied  to  the  same  picture  with  no  target  im¬ 
planted,  then  ideally  no  false  tracks  show  up.  For  a  no-target  picture  we 
expect  a  score  or  success  rate  of  zero.  This  would  mean  that  no  high  value 
TTP  tracks  have  survived  for  over  50%  of  the  simulation. 

6.1  Tracking  Curves 

The  main  point  of  the  this  tracking  algorithm  is  to  track  targets  at  a  low  SNR 
level.  It  is  important  to  have  a  yardstick  to  compare  different  algorithms 
performance  or  even  the  same  algorithmic  performance  but  with  different 
parameters,  for  different  levels  of  SNR.  This  is  easy  with  simulated  data. 
The  target  is  inserted  at  different  levels  of  intensity.  For  a  higher  intensity 
one  expects  a  higher  average  score  or  success  rate.  The  difference  between 
algorithms  is,  first,  the  minimum  level  of  target  intensity  that  results  in  a 
score  or  success  rate  above  zero;  second,  the  rate  that  the  score  or  success 
rate  rises.  In  fig.  7  two  different  methods  of  collapsing  the  hyperspectral 
cube  are  compared.  The  first  is  the  dot  product  between  the  pixel  vector 
and  the  target  vector 

tT  (u(m,  n)  —  u(m,  n))  (136) 
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and  the  second  is  the  matched  filter  from  Equation.  (132). 


Score  versus  Target  Intensity  Sucess  rate  versus  Target  Intensity 


Target  Intensity  Target  Intensity 


(a)  (b) 

Figure  7:  The  TOC  curves:  (a)  The  average  score  rate  versus  target  inten¬ 
sity,  (b)  The  average  success  rate  versus  target  intensity 

It  is  clear  that  the  matched  filter  is  better  then  merely  projecting  the 
target  onto  the  target  signal. 
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7  Tracking  results  for  Synthetic  Data  Produced 
by  Irena  and  Louisa 

The  synthetic  data  here  is  taken  from  na23a.  It  was  converted  into  a  Hy- 
perspectral  movie  by  using  a  cloud  signature  and  sky  signature.  Research 
on  converting  IR  movies  into  hyperspectral  movies  has  been  done  by  Louisa 
Varsano,  continued  by  Irena  and  is  explained  in  [  4].  A  target  signal  was 
implanted  diagonally  in  time  in  the  movie.  The  target  signal  is  spread  over 
a  few  pixels.  Three  ways  of  collapsing  the  hyperspectral  movie  are: 

Test  2 

lT(u(m,  n)  —  u  (m,  n))  (137) 

Test  2  half 

tT  (u(m,  n )  —  u(m,  n))  (138) 

Test  3 

tTI]_1  (u(m,  n)  —  u(m,  n))  (139) 

The  results  for  the  Irena  movies  are  shown  in  Table  5: 


Test  2 

Test  2  half 

Test  3 

Irena  Movie 

Success  Rate 

i 

i 

i 

Score  Rate 

i 

i 

i 

Stanley  Movie 

Maximum  TTP 

0.665 

0.6653 

0.1111 

Table  5:  The  Table  of  results  for  the  Irena  Movie 

Here  is  an  example  of  a  target  path  and  tracking  result  from  an  Irena 
movie  with  Test  3.  in  Fig.  8. 

Note  that  I  compare  the  results  to  the  Stanley  movies.  The  Stanley 
movies  are  movies  without  targets  implanted.  The  score  for  evaluating  the 
Stanley  Movies  is  the  maximum  TTP  amongst  the  tracks  that  survive  for 
over  50%  of  the  tracking  process.  The  Maximum  TTP  for  the  Stanley  movies 
have  to  be  very  low,  i.e.  ,  at  the  very  least,  lower  than  0.9.  For  example  in 
Table  5  we  get  0.665  for  Test  2  on  a  Stanley  movie.  This  means  that  6  of 
the  9  movies  have  a  maximum  TTP  of  almost  1.  In  other  words  we  have  a 
false  track  rate  of  6  out  9  when  there  is  no  target.  For  a  real  system  this 
is  not  acceptable  Otherwise  this  means  that  the  algorithm  will  track  clutter 
when  there  are  no  targets  which  isn’t  very  useful. 

The  Louisa  movies  are  divided  into  4  different  types  according  to  the 
intensity  of  the  target.  The  intensity  goes  from  1  to  4  with  1  being  the  most 
intense. The  results  are  shown  in  the  following  tables: 
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Tracking  Results:  Estimated  tracks-blue,  Real  track-green 


Figure  8:  The  target  starts  in  left  upper  corner  and  proceeds  to  move  diag¬ 
onally  over  the  frame.  The  zigzag  motion  is  due  to  the  pixel  entering  and 
leaving  a  pixel.The  green  line  is  the  path  of  center  the  target  and  the  blue 
line  is  the  result  of  the  tracking  algorithm. 


Test  2 

Test  2  half 

Test  3 

Louisa  Movie 

Success  Rate 

0.7778 

i 

i 

Score  Rate 

0.6881 

0.9142 

0.9142 

Stanley  Movie 

Maximum  TTP 

0.665 

0.6653 

0.1111 

Table  6:  The  Table  of  results  for  the  Louisa  Movies  intensity  1 


Test  2 

Test  2  half 

Test  3 

Louisa  Movie 

Success  Rate 

0.5556 

0.6667 

0.6667 

Score  Rate 

0.4937 

0.6115 

0.6115 

Stanley  Movie 

Maximum  TTP 

0.665 

0.6653 

0.1111 

Table  7:  The  Table  of  results  for  the  Louisa  Movies  intensity  2 


Test  2 

Test  2  half 

Test  3 

Louisa  Movie 

Success  Rate 

0 

0.4444 

0.4444 

Score  Rate 

0 

0.3889 

0.3889 

Stanley  Movie 

Maximum  TTP 

0.665 

0.6653 

0.1111 

Table  8:  The  Table  of  results  for  the  Louisa  Movies  intensity  3 


The  only  test  that  provides  acceptable  results  for  both  movies  is  Test  3. 
The  reason  is  that  the  clutter  isn’t  white  for  Test  2  and  Test  2=  therefore  the 
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Test  2 

Test  2  half 

Test  3 

Louisa  Movie 

Success  Rate 

0 

0 

0 

Score  Rate 

0 

0 

0 

Stanley  Movie 

Maximum  TTP 

0.665 

0.6653 

0.1111 

Table  9:  The  Table  of  results  for  the  Louisa  Movies  intensity  4 


algorithm  locks  on  to  clutter  usually  edges  of  clouds.  The  tracking  results 
for  Test  were  good  and  usually  equal  to  those  of  Test  3.  However  the 
results  when  no  target  was  implanted  are  not  good  for  Test  2^.  It  might 
be  possible  to  eliminate  the  problem  by  better  pre-processing  methods,  then 
Test  2^  might  provide  better  results  also  when  there  is  no  target.  In  order  to 
be  worth  the  effort  a  pre-processing  method  would  have  to  be  more  efficient 
than  estimating  the  covariance  matrix. 

It  can  be  seen  from  the  Louisa  movies  that  a  higher  intensity  target  has 
a  higher  chance  of  being  tracked. 
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8  Conclusions 


There  are  five  principal  issues  which  have  been  addressed  in  order  to  get  the 
IMM  approach  to  work  in  a  high  clutter  environment. 

•  The  pre-processing  level  is  vital  to  successful  tracking  among  the  more 
common  approaches  we  have  applied  linear  filtering  and  ordered  sta¬ 
tistics  filter.  In  the  future,  beside  searching  for  the  best  method  of 
feature  extraction,  we  should  consider  combining  different  methods  of 
feature  extraction/pre-processing  methods. 

•  Normalization  of  background  noise  and  time  dynamic  thresholding; 

—  Normalization  of  background  noise-The  normalization  of  the  back¬ 
ground  noise  is  important  for  the  tracking  process.  This  was  done 
by  dividing  each  pixel  by  a  constant  and  the  local  standard  devi¬ 
ation  around  each  pixel.  A  method  for  calculating  the  constant 
was  described. 

—  Time  dynamic  thresholding-  Raising  or  lowering  the  threshold  in 
the  case  of  too  many  tracks  or  no  tracks  at  all. 

•  When  tracking  in  an  environment  with  a  low  SNR  it  is  necessary  to 
continually  track  a  number  of  high  probability  tracks  simultaneously. 
That  is  a  good  way  to  keep  tracking  the  target  if  its  probability  drops. 
This  makes  it  possible  to  regain  acquisition  of  the  target,  if  it  disap¬ 
pears  and  then  reappears.  In  fact  this  is  the  same  approach  taken 
for  the  track  initiation  stage.  In  future  work,  the  current  algorithm, 
which  is  better  suited  to  a  single  target,  can  be  improved  significantly 
by  using  the  multiple  target  approach.  The  various  ways  to  do  this 
are  the  JPDAF  or  MHT. 

•  The  conclusions  for  the  Collapsed  Hyperspectral  movies  are  as  ex¬ 
pected.  The  higher  the  intensity  of  the  implanted  target,  the  better 
chance  there  is  of  tracking  it.  In  order  to  reduce  the  probability  of 
tracking  clutter,  the  clutter  must  be  whitened.  The  results  show  that 
the  Matched  Filter  does  a  good  job  at  this. 

•  The  model  used  for  the  amplitude  of  the  target  is  in  fact  crucial  for 
successfully  tracking  the  target.  This  point  target  is  in  general,  difficult 
to  model;  however,  a  satisfactory  solution  was  found  for  this  problem. 
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Appendixes 

A  Calculating  Pq  explicitly 

I  will  show  how  to  explicitly  calculate  the  gate  probability  Pq  for  nz  =  1,  2,  3. 
The  first  dimension  is  easy 
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hi 


\/27T  J-g 

i  2  r9 


exp  |  —  -z2  )  dz 


1 


v2^Jo  expi-^2r 

2  fa  f  1  o )  dz 

7^1  ew\~2z  J  7! 

—j=  /  exp{— 

V ' 7T  JO 


=  erf 


For  the  second  dimension  nz  =  2 
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This  is  done  by  using  the  following  transform  from  cartesian  coordinates  to 
polar  coordinates. 


zi  =  rcos</>  (142) 

Z2  =  rsin0  (143) 


For  the  third  dimension 
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Note  that  once  again  I  transformed  the  cartesian  coordinates  to  spherical 
coordinates  with  the  following  transformation 


*1 

=  r  cos  (f)  sin  6 

(145) 

^2 
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^3 

=  r  cos  6 

(147) 

Expression  (144)  can  be  turned  into  an  incomplete  gamma  function  by 
a  change  of  variables.  The  incomplete  gamma  function  is  defined  as 

P(x,  a)  =  [  ta~1e~tdt  (148) 

!»  Jo 

A  change  of  variables  is  carried  out  by  defining  t  =  \r2  and  then  we  get 
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where  T  (|)  =  ^ 
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Inductively  it  can  be  shown  that  Pq  as  a  function  of  7  and  nz  is 


PG(7,n,)  =  pg, 


n7 


(150) 
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